OXFORD 


Introduction to 
Modern Analysis 


Second Edition 


Shmuel Kantorovitz and Ami Viselter 


OXFORD GRADUATE TEXTS IN MATHEMATICS | 29 


Oxford Graduate Texts in Mathematics 
Series Editors 


R. Cohen S. K. Donaldson
T. J. Lyons M. J. Taylor 


OXFORD GRADUATE TEXTS IN MATHEMATICS 


Keith Hannabuss: An Introduction to Quantum Theory 

Reinhold Meise and Dietmar Vogt: Introduction to Functional Analysis 

James G. Oxley: Matroid Theory 

N. J. Hitchin, G. B. Segal, and R. S. Ward: Integrable Systems: Twistors, Loop 
Groups, and Riemann Surfaces 

Wulf Rossmann: Lie Groups: An Introduction Through Linear Groups 

Qing Liu: Algebraic Geometry and Arithmetic Curves

Martin R. Bridson and Simon M. Salamon (eds): Invitations to Geometry and Topology
Shmuel Kantorovitz: Introduction to Modern Analysis 

Terry Lawson: Topology: A Geometric Approach 

Meinolf Geck: An Introduction to Algebraic Geometry and Algebraic Groups 
Alastair Fletcher and Vladimir Markovic: Quasiconformal Maps and Teichmüller
Theory

Dominic Joyce: Riemannian Holonomy Groups and Calibrated Geometry 
Fernando Villegas: Experimental Number Theory 

Péter Medvegyev: Stochastic Integration Theory 

Martin A. Guest: From Quantum Cohomology to Integrable Systems 

Alan D. Rendall: Partial Differential Equations in General Relativity 

Yves Félix, John Oprea, and Daniel Tanré: Algebraic Models in Geometry 

Jie Xiong: Introduction to Stochastic Filtering Theory 

Maciej Dunajski: Solitons, Instantons, and Twistors 

Graham R. Allan: Introduction to Banach Spaces and Algebras 

James Oxley: Matroid Theory, Second Edition 

Simon Donaldson: Riemann Surfaces 

Clifford Henry Taubes: Differential Geometry: Bundles, Connections, Metrics and 
Curvature 

Gopinath Kallianpur and P. Sundar: Stochastic Analysis and Diffusion Processes 
Selman Akbulut: 4-Manifolds 

Fon-Che Liu: Real Analysis 

Dusa McDuff and Dietmar Salamon: Introduction to Symplectic Topology, Third 
Edition 

Chris Heunen, Jamie Vicary: Categories for Quantum Theory: An Introduction 
Shmuel Kantorovitz, Ami Viselter: Introduction to Modern Analysis, Second Edition 


Shmuel Kantorovitz 


Bar Ilan University, Israel 


Ami Viselter 


University of Haifa, Israel 


OXFORD 


UNIVERSITY PRESS 


Great Clarendon Street, Oxford, OX2 6DP, 
United Kingdom 


Oxford University Press is a department of the University of Oxford. 
It furthers the University’s objective of excellence in research, scholarship, 
and education by publishing worldwide. Oxford is a registered trade mark of 
Oxford University Press in the UK and in certain other countries 


© Shmuel Kantorovitz and Ami Viselter 2022 
The moral rights of the authors have been asserted 
Impression: 1 


All rights reserved. No part of this publication may be reproduced, stored in 
a retrieval system, or transmitted, in any form or by any means, without the 
prior permission in writing of Oxford University Press, or as expressly permitted 
by law, by licence or under terms agreed with the appropriate reprographics 
rights organization. Enquiries concerning reproduction outside the scope of the 
above should be sent to the Rights Department, Oxford University Press, at the 
address above 


You must not circulate this work in any other form 
and you must impose this same condition on any acquirer 


Published in the United States of America by Oxford University Press 
198 Madison Avenue, New York, NY 10016, United States of America 


British Library Cataloguing in Publication Data 
Data available 


Library of Congress Control Number: 2022933188 


ISBN 978-0-19-284954-0 (hbk) 
ISBN 978-0-19-284955-7 (pbk) 


DOI: 10.1093/oso/9780192849540.001.0001


Printed and bound by 
CPI Group (UK) Ltd, Croydon, CR0 4YY


Links to third party websites are provided by Oxford in good faith and 
for information only. Oxford disclaims any responsibility for the materials 
contained in any third party website referenced in this work. 


To Ita, Bracha, Pnina, Pinchas, Ruth, and Lilach 


Contents


Preface to the First Edition
Preface to the Second Edition

1 Measures
1.1 Measurable sets and functions
1.2 Positive measures
1.3 Integration of non-negative measurable functions
1.4 Integrable functions
1.5 $L^p$-spaces
1.6 Inner product
1.7 Hilbert space: a first look
1.8 The Lebesgue-Radon-Nikodym theorem
1.9 Complex measures
1.10 Convergence
1.11 Convergence on finite measure space
1.12 Distribution function
1.13 Truncation
Exercises

2 Construction of measures
2.1 Semi-algebras
2.2 Outer measures
2.3 Extension of measures on algebras
2.4 Structure of measurable sets
2.5 Construction of Lebesgue-Stieltjes measures
2.6 Riemann vs. Lebesgue
2.7 Product measure
Exercises

3 Measure and topology
3.1 Partition of unity
3.2 Positive linear functionals
3.3 The Riesz-Markov representation theorem
3.4 Lusin’s theorem
3.5 The support of a measure
3.6 Measures on $\mathbb{R}^k$; differentiability
Exercises

4 Continuous linear functionals
4.1 Linear maps
4.2 The conjugates of Lebesgue spaces
4.3 The conjugate of $C_c(X)$
4.4 The Riesz representation theorem
4.5 Haar measure
Exercises

5 Duality
5.1 The Hahn-Banach theorem
5.2 Reflexivity
5.3 Separation
5.4 Topological vector spaces
5.5 Weak topologies
5.6 Extremal points
5.7 The Stone-Weierstrass theorem
5.8 Operators between Lebesgue spaces: Marcinkiewicz’s interpolation theorem
5.9 Fixed points
5.10 The bounded weak*-topology
Exercises

6 Bounded operators
6.1 Category
6.2 The uniform boundedness theorem
6.3 The open mapping theorem
6.4 Graphs
6.5 Quotient space
6.6 Operator topologies
Exercises

7 Banach algebras
7.1 Basics
7.2 Commutative Banach algebras
7.3 Involutions and C*-algebras
7.4 Normal elements
7.5 The Arens products
Exercises

8 Hilbert spaces
8.1 Orthonormal sets
8.2 Projections
8.3 Orthonormal bases
8.4 Hilbert dimension
8.5 Isomorphism of Hilbert spaces
8.6 Direct sums
8.7 Canonical model
8.8 Tensor products
8.8.1 An interlude: tensor products of vector spaces
8.8.2 Tensor products of Hilbert spaces
Exercises

9 Integral representation
9.1 Spectral measure on a Banach subspace
9.2 Integration
9.3 Case $Z = X$
9.4 The spectral theorem for normal operators
9.5 Parts of the spectrum
9.6 Spectral representation
9.7 Renorming method
9.8 Semi-simplicity space
9.9 Resolution of the identity on $Z$
9.10 Analytic operational calculus
9.11 Isolated points of the spectrum
9.12 Compact operators
Exercises

10 Unbounded operators
10.1 Basics
10.2 The Hilbert adjoint
10.3 The spectral theorem for unbounded selfadjoint operators
10.4 The operational calculus for unbounded selfadjoint operators
10.5 The semi-simplicity space for unbounded operators in Banach space
10.6 Symmetric operators in Hilbert space
10.7 Quadratic forms
Exercises

11 C*-algebras
11.1 Notation and examples
11.2 The continuous operational calculus continued
11.3 Positive elements
11.4 Approximate identities
11.5 Ideals
11.6 Positive linear functionals
11.7 Representations and the Gelfand-Naimark-Segal construction
11.7.1 Irreducible representations
11.8 Positive linear functionals and convexity
11.8.1 Pure states
11.8.2 Decompositions of functionals
Exercises

12 Von Neumann algebras
12.1 Preliminaries
12.2 Commutants
12.3 Density
12.4 The polar decomposition
12.5 W*-algebras
12.6 Hilbert-Schmidt and trace-class operators
12.7 Commutative von Neumann algebras
12.8 The enveloping von Neumann algebra of a C*-algebra
Exercises

13 Constructions of C*-algebras
13.1 Tensor products of C*-algebras
13.1.1 Tensor products of algebras
13.1.2 Tensor products of C*-algebras through representations
13.1.3 The maximal tensor product
13.1.4 Tensor products of bounded linear functionals
13.1.5 The minimal tensor product
13.1.6 Tensor products by commutative C*-algebras
13.2 Group C*-algebras
13.2.1 Unitary representations
13.2.2 The definition and representations of the group C*-algebra
13.2.3 Properties of the group C*-algebra
Exercises

Application I Probability
I.1 Heuristics
I.2 Probability space
I.2.1 $L^2$-random variables
I.3 Probability distributions
I.4 Characteristic functions
I.5 Vector-valued random variables
I.6 Estimation and decision
I.6.1 Confidence intervals
I.6.2 Testing of hypothesis and decision
I.6.3 Tests based on a statistic
I.7 Conditional probability
I.7.1 Heuristics
I.7.2 Conditioning by an r.v.
I.8 Series of $L^2$ random variables
I.9 Infinite divisibility
I.10 More on sequences of random variables

Application II Distributions
II.1 Preliminaries
II.2 Distributions
II.3 Temperate distributions
II.3.1 The spaces $W_{p,k}$
II.4 Fundamental solutions
II.5 Solution in $\mathcal{E}'$
II.6 Regularity of solutions
II.7 Variable coefficients
II.8 Convolution operators
II.9 Some holomorphic semigroups

Bibliography
Index


Preface to the First Edition 


This book grew out of lectures given since 1964 at Yale University, the 
University of Illinois at Chicago, and Bar Ilan University. The material 
covers the usual topics of Measure Theory and Functional Analysis, with 
applications to Probability Theory and to the theory of linear partial differential 
equations. Some relatively advanced topics are included in each chapter 
(excluding the first two): the Riesz-Markov representation theorem and 
differentiability in Euclidean spaces (Chapter 3); Haar measure (Chapter 4); 
Marcinkiewicz’s interpolation theorem (Chapter 5); the Gelfand-Naimark-Segal
representation theorem (Chapter 7); the von Neumann double commutant 
theorem (Chapter 8); the spectral representation theorem for normal operators 
(Chapter 9); the extension theory for unbounded symmetric operators 
(Chapter 10); the Lyapounov Central Limit theorem and the Kolmogoroff
“Three Series theorem” (Application I); the Hörmander-Malgrange theorem,
fundamental solutions of linear partial differential equations with variable
coefficients, and Hörmander’s theory of convolution operators, with an
application to integration of pure imaginary order (Application II). Some
important complementary material is included in the ‘Exercises’ sections, with 
step-by-step detailed hints leading to the wanted results. Solutions to the end 
of chapter exercises may be found on the companion website for this text: 
http://www.oup.co.uk/academic/companion/mathematics/kantorovitz. 


Ramat Gan S. K. 
July 2002 


Preface to the Second Edition


The purpose of the second edition is to make our Introduction to Modern Analysis 
more modern. We did this mostly by broadening and deepening the presentation 
of operator algebras, which form a central area in functional analysis. There are 
three new chapters: Chapter 11 on C*-algebras, Chapter 12 on von Neumann 
algebras, and Chapter 13 on constructions of C*-algebras. They contain much 
more material on these subjects than the first edition. These chapters are also 
more advanced than the previous parts of the book and require more from the 
reader, occasionally in the form of guided exercises. Nevertheless, what we give 
here is merely a taste of operator algebras. 

In addition, we made numerous corrections and added quite a lot of exercises. 
There are also new subjects of independent interest: fixed-point theorems 
(Chapter 5); the bounded weak*-topology (Chapter 5); the Arens products 
(Chapter 7); tensor products of vector spaces and of Hilbert spaces (Chapter 8); 
and quadratic forms (Chapter 10). 


Ramat Gan and Haifa S. K. and A. V. 
December 2021 


1 Measures


This chapter begins the study of measure theory, which spans Chapters 1-3 and
most of Chapter 4. Let us explain first what necessitated this theory.
Consider the set of all Riemann integrable functions on an interval $[a, b]$. It
becomes a semi-normed space with respect to the semi-norm $\|f\| := \int_a^b |f(x)|\,dx$.
The problem is that this space is not complete: it admits non-convergent Cauchy
sequences. As discussed later in the book, in modern analysis it is especially
important for (semi-) normed spaces to be complete. Measure theory, invented
by H. L. Lebesgue, introduces the concept of a measure space, which is a triple
$(X, \mathcal{A}, \mu)$, where $X$ is a set, $\mathcal{A}$ is the $\sigma$-algebra of all measurable subsets of $X$, and
$\mu : \mathcal{A} \to [0, \infty]$ is a measure. To each such triple there is an associated Lebesgue
integral. In the particular case when $X = [a, b]$ and $\mu$ is the Lebesgue measure
(very roughly, $\mu([c, d]) = d - c$, which justifies the word “measure”), every
Riemann integrable function is also Lebesgue integrable (but not conversely!)
and the integrals coincide.

One big virtue of Lebesgue integration is that the space of integrable functions 
that comes out of it is complete. In fact, to every measure space we associate not
one complete normed space, but a continuum of them: the so-called $L^p$-spaces
($p \in [1, \infty]$).

The chapter is structured as follows. We first introduce positive measure 
spaces and Lebesgue integration on them and prove several convergence theorems 
that are fundamental in the theory. We then define the $L^p$-spaces and prove that
they are Banach spaces, that is, complete normed spaces. Next, we prove a few 
basic facts on Hilbert spaces culminating in the “little” Riesz representation 
theorem (we return to Hilbert spaces in Chapter 8). Hilbert spaces are needed 
in the proof of the Lebesgue-Radon-Nikodym theorem, a deep result about
the relationship between two arbitrary measures. Complex measures are then 
introduced and studied. A few notions of convergence of sequences of measurable 
functions are defined and the relations between them are explained, including 
a surprising theorem of Egoroff saying that on finite measure spaces, pointwise 
convergence is “almost uniform”. A short treatment of the distribution function of
a measurable function follows. The chapter ends with the notion of a truncation
of a function.


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press.
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0001


1.1 Measurable sets and functions 


The setting of abstract measure theory is a family $\mathcal{A}$ of so-called measurable
subsets of a given set $X$, and a function

\[ \mu : \mathcal{A} \to [0, \infty], \]

so that the measure $\mu(E)$ of the set $E \in \mathcal{A}$ has some “intuitively desirable”
property, such as “countable additivity”:

\[ \mu\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{i=1}^{\infty} \mu(E_i), \]

for mutually disjoint sets $E_i \in \mathcal{A}$. In order to make sense, this setting has to
deal with a family $\mathcal{A}$ that is closed under countable unions. We then arrive at
the concept of a measurable space.


Definition 1.1. Let $X$ be a (non-empty) set. A $\sigma$-algebra of subsets of $X$
(briefly, a $\sigma$-algebra on $X$) is a subfamily $\mathcal{A}$ of the family $\mathcal{P}(X)$ of all subsets
of $X$, with the following properties:


(1) $X \in \mathcal{A}$;
(2) if $E \in \mathcal{A}$, then the complement $E^c$ of $E$ belongs to $\mathcal{A}$;
(3) if $\{E_i\}$ is a sequence of sets in $\mathcal{A}$, then its union belongs to $\mathcal{A}$.


The ordered pair $(X, \mathcal{A})$, with $\mathcal{A}$ a $\sigma$-algebra on $X$, is called a measurable
space. The sets of the family $\mathcal{A}$ are called measurable sets (or $\mathcal{A}$-measurable sets)
in $X$.

Observe that by (1) and (2), the empty set $\emptyset$ belongs to the $\sigma$-algebra $\mathcal{A}$.
Taking then $E_i = \emptyset$ for all $i > n$ in (3), we see that $\mathcal{A}$ is closed under finite
unions; if this weaker condition replaces (3), $\mathcal{A}$ is called an algebra of subsets
of $X$ (briefly, an algebra on $X$).

By (2) and (3), and De Morgan’s law, $\mathcal{A}$ is closed under countable
intersections (finite intersections, in the case of an algebra). In particular, any
algebra on $X$ is closed under differences $E - F := E \cap F^c$.

The intersection of an arbitrary family of $\sigma$-algebras on $X$ is a $\sigma$-algebra
on $X$. If all the $\sigma$-algebras in the family contain some fixed collection $\mathcal{E} \subset \mathcal{P}(X)$,
the said intersection is the smallest $\sigma$-algebra on $X$ (with respect to set inclusion)
that contains $\mathcal{E}$; it is called the $\sigma$-algebra generated by $\mathcal{E}$, and is denoted by $[\mathcal{E}]$.
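
On a finite set $X$ the generated $\sigma$-algebra can be computed explicitly, since closure under countable unions reduces to closure under finite unions. The following is a minimal sketch under that assumption; the function name `generated_sigma_algebra` and the sample collection are ours and serve only as an illustration.

```python
from itertools import combinations


def generated_sigma_algebra(X, collection):
    """Smallest family of subsets of the finite set X that contains
    `collection` and X and is closed under complements and unions."""
    X = frozenset(X)
    family = {X, frozenset()} | {frozenset(E) for E in collection}
    changed = True
    while changed:  # repeat until no new set appears
        changed = False
        current = list(family)
        for E in current:  # close under complements
            comp = X - E
            if comp not in family:
                family.add(comp)
                changed = True
        for E, F in combinations(current, 2):  # close under pairwise unions
            union = E | F
            if union not in family:
                family.add(union)
                changed = True
    return family


# Hypothetical example: X = {1, 2, 3, 4}, generators {1} and {1, 2}.
sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
print(sorted(sorted(E) for E in sigma))
```

In this example the generated family is determined by the three "atoms" {1}, {2}, and {3, 4}: every measurable set is a union of atoms, and the points 3 and 4 cannot be separated by a measurable set.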

An important case comes up naturally when $X$ is a topological space (for
some topology $\tau$). The $\sigma$-algebra $[\tau]$ generated by the topology is called the
Borel ($\sigma$)-algebra [denoted $\mathcal{B}(X)$], and the sets in $\mathcal{B}(X)$ are the Borel sets in $X$.
For example, the countable intersection of $\tau$-open sets (a so-called $G_\delta$-set) and
the countable union of $\tau$-closed sets (a so-called $F_\sigma$-set) are Borel sets.


Definition 1.2. Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be measurable spaces. A map $f : X \to Y$
is measurable if for each $B \in \mathcal{B}$, the set

\[ f^{-1}(B) := \{x \in X;\ f(x) \in B\} = [f \in B] \]

belongs to $\mathcal{A}$.


A constant map $f(x) = p \in Y$ is trivially measurable, since $[f \in B]$ is either
$\emptyset$ or $X$ (when $p \in B^c$ and $p \in B$, respectively), and so belongs to $\mathcal{A}$.

When $Y$ is a topological space, we shall usually take $\mathcal{B} = \mathcal{B}(Y)$, the Borel
algebra on $Y$. In particular, for $Y = \mathbb{R}$ (the real line), $Y = [-\infty, \infty]$ (the
“extended real line”), or $Y = \mathbb{C}$ (the complex plane), with their usual topologies,
we shall call the measurable map a measurable function (more precisely, an
$\mathcal{A}$-measurable function). If $X$ is a topological space, a $\mathcal{B}(X)$-measurable map
(function) is called a Borel map (function).

Given a measurable space $(X, \mathcal{A})$ and a map $f : X \to Y$, for an arbitrary set
$Y$, the family

\[ \mathcal{B}_f := \{F \in \mathcal{P}(Y);\ f^{-1}(F) \in \mathcal{A}\} \]

is a $\sigma$-algebra on $Y$ (because the inverse image operation preserves the set
theoretical operations: $f^{-1}(\bigcup_\alpha F_\alpha) = \bigcup_\alpha f^{-1}(F_\alpha)$, etc.), and it is the largest
$\sigma$-algebra on $Y$ for which $f$ is measurable.

If $Y$ is a topological space, and $f^{-1}(V) \in \mathcal{A}$ for every open $V$, then $\mathcal{B}_f$
contains the topology $\tau$, and so contains $\mathcal{B}(Y)$; that is, $f$ is measurable. Since
$\tau \subset \mathcal{B}(Y)$, the converse is trivially true.


Lemma 1.3. A map $f$ from a measurable space $(X, \mathcal{A})$ to a topological space $Y$
is measurable if and only if $f^{-1}(V) \in \mathcal{A}$ for every open $V \subset Y$.


In particular, if $X$ is also a topological space, and $\mathcal{A} = \mathcal{B}(X)$, it follows that
every continuous map $f : X \to Y$ is a Borel map.


Lemma 1.4. A map $f$ from a measurable space $(X, \mathcal{A})$ to $[-\infty, \infty]$ is measurable
if and only if

\[ [f > c] \in \mathcal{A} \]

for all real $c$.


The non-trivial direction in the lemma follows from the fact that $(c, \infty] \in \mathcal{B}_f$
by hypothesis for all real $c$; therefore, the $\sigma$-algebra $\mathcal{B}_f$ contains the sets

\[ [-\infty, c) = \bigcup_{n=1}^{\infty} [-\infty, c - 1/n] = \bigcup_{n=1}^{\infty} (c - 1/n, \infty]^c \]

and $(a, b) = [-\infty, b) \cap (a, \infty]$ for every real $a < b$, and so contains all countable
unions of “segments” of the above type, that is, all open subsets of $[-\infty, \infty]$.

The sets $[f > c]$ in the condition of Lemma 1.4 can be replaced by any of
the sets $[f \ge c]$, $[f < c]$, or $[f \le c]$ (for all real $c$), respectively. The proofs are
analogous.


For $f : X \to [-\infty, \infty]$ measurable and $\alpha$ real, the function $\alpha f$ (defined
pointwise, with the usual arithmetics $\alpha \cdot \infty = \infty$ for $\alpha > 0$, $= 0$ for $\alpha = 0$,
and $= -\infty$ for $\alpha < 0$, and similarly for $-\infty$) is measurable, because for all real
$c$, $[\alpha f > c] = [f > c/\alpha]$ for $\alpha > 0$, $= [f < c/\alpha]$ for $\alpha < 0$, and $\alpha f$ is constant for
$\alpha = 0$.

If $\{a_n\} \subset [-\infty, \infty]$, one denotes the superior (inferior) limit, that is, the
“largest” (“smallest”) limit point, of the sequence by $\limsup a_n$ ($\liminf a_n$,
respectively).

Let $b_n := \sup_{k \ge n} a_k$. Then $\{b_n\}$ is a decreasing sequence, and therefore

\[ \exists \lim_n b_n = \inf_n b_n. \]

Let $\alpha := \limsup a_n$ and $\beta := \lim b_n$. For any given $n \in \mathbb{N}$, $a_k \le b_n$ for all $k \ge n$,
and therefore $\alpha \le b_n$. Hence $\alpha \le \beta$.

On the other hand, for any $t > \alpha$, $a_k > t$ for at most finitely many indices $k$.
Therefore, there exists $n_0$ such that $a_k \le t$ for all $k \ge n_0$, hence $b_{n_0} \le t$. But
then $b_n \le t$ for all $n \ge n_0$ (because $\{b_n\}$ is decreasing), and so $\beta \le t$. Since
$t > \alpha$ was arbitrary, it follows that $\beta \le \alpha$, and the conclusion $\alpha = \beta$ follows. We
showed

\[ \limsup a_n = \lim_n \Big(\sup_{k \ge n} a_k\Big) = \inf_{n \in \mathbb{N}} \Big(\sup_{k \ge n} a_k\Big). \tag{1} \]

Similarly

\[ \liminf a_n = \lim_n \Big(\inf_{k \ge n} a_k\Big) = \sup_{n \in \mathbb{N}} \Big(\inf_{k \ge n} a_k\Big). \tag{2} \]
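
Relations (1) and (2) can be observed numerically on a finite truncation of a sequence. The sketch below is only an illustration: the sample sequence $a_k = (-1)^k (1 + 1/k)$ and the helper names `tail_sups` and `tail_infs` are our assumptions, and a finite list can only approximate the true tail suprema and infima near its end.

```python
def tail_sups(a):
    """Return [b_1, ..., b_N] with b_n = max(a_n, ..., a_N) for a finite list a."""
    b = []
    running = float("-inf")
    for x in reversed(a):  # sweep from the right, keeping the running maximum
        running = max(running, x)
        b.append(running)
    return b[::-1]


def tail_infs(a):
    """Tail infima, obtained from the identity inf a = -sup(-a)."""
    return [-s for s in tail_sups([-x for x in a])]


# Hypothetical sample sequence with limsup 1 and liminf -1.
a = [(-1) ** k * (1 + 1 / k) for k in range(1, 2001)]
b = tail_sups(a)  # decreasing, approaches limsup a_n = 1
c = tail_infs(a)  # increasing, approaches liminf a_n = -1
print(b[0], b[999], c[0], c[998])
```

The list `b` is decreasing and the list `c` is increasing, in agreement with the argument above; their values far along the sequence are close to 1 and -1, respectively.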
Lemma 1.5. Let $\{f_n\}$ be a sequence of measurable $[-\infty, \infty]$-valued functions
on the measurable space $(X, \mathcal{A})$. Then the functions $\sup f_n$, $\inf f_n$, $\limsup f_n$,
$\liminf f_n$, and $\lim f_n$ (when it exists), all defined pointwise, are measurable.


Proof. Let $h = \sup f_n$. Then for all real $c$,

\[ [h > c] = \bigcup_n [f_n > c] \in \mathcal{A}, \]

so that $h$ is measurable by Lemma 1.4.
As remarked, $-f_n = (-1) f_n$ are measurable, and therefore $\inf f_n = -\sup(-f_n)$
is measurable.
The proof is completed by the relations (1), (2), and

\[ \lim f_n = \limsup f_n = \liminf f_n, \]

when the second equality holds (i.e. if and only if $\lim f_n$ exists).


In particular, taking a sequence with $f_k = f_n$ for all $k > n$, we see
that $\max\{f_1, \ldots, f_n\}$ and $\min\{f_1, \ldots, f_n\}$ are measurable, when $f_1, \ldots, f_n$ are
measurable functions into $[-\infty, \infty]$. For example, the positive (negative) parts
$f^+ := \max\{f, 0\}$ ($f^- := -\min\{f, 0\}$) of a measurable function $f : X \to [-\infty, \infty]$
are (non-negative) measurable functions. Note the decompositions

\[ f = f^+ - f^-, \qquad |f| = f^+ + f^-. \]


Lemma 1.6. Let $(X, \mathcal{A})$, $(Y, \mathcal{B})$ and $(Z, \mathcal{C})$ be measurable spaces. If $f : X \to Y$
and $g : Y \to Z$ are measurable, then so is the composite function $h := g \circ f :
X \to Z$.


Indeed, for every $C \in \mathcal{C}$ we have $g^{-1}(C) \in \mathcal{B}$ by measurability of $g$, thus
$h^{-1}(C) = f^{-1}(g^{-1}(C)) \in \mathcal{A}$ by measurability of $f$.

In particular, if $Y, Z$ are topological spaces and $g : Y \to Z$ is continuous,
then $g \circ f$ is measurable.

If

\[ Y = \prod_{k=1}^{n} Y_k \]

is the product space of topological spaces $Y_k$, the projections $p_k : Y \to Y_k$ are
continuous. Therefore, if $f : X \to Y$ is measurable, so are the “component
functions” $f_k(x) := p_k(f(x)) : X \to Y_k$ ($k = 1, \ldots, n$), by Lemma 1.6.
Conversely, if the topologies on $Y_k$ have countable bases (for all $k$), a countable
base for the topology of $Y$ consists of sets of the form $V = \prod_{k=1}^{n} V_k$ with $V_k$
varying in a countable base for the topology of $Y_k$ (for each $k$). Now,

\[ [f \in V] = \bigcap_{k=1}^{n} [f_k \in V_k] \in \mathcal{A} \]

if all $f_k$ are measurable. Since every open $W \subset Y$ is a countable union of sets of
the above type, $[f \in W] \in \mathcal{A}$, and $f$ is measurable. We proved:


Lemma 1.7. Let $Y$ be the Cartesian product of topological spaces $Y_1, \ldots, Y_n$
with countable bases to their topologies. Let $(X, \mathcal{A})$ be a measurable space. Then
$f : X \to Y$ is measurable iff the components $f_k$ are measurable for all $k$.


For example, if $f_k : X \to \mathbb{C}$ are measurable for $k = 1, \ldots, n$, then $f :=
(f_1, \ldots, f_n) : X \to \mathbb{C}^n$ is measurable, and since $g(z_1, \ldots, z_n) := \sum_k \alpha_k z_k$ ($\alpha_k \in \mathbb{C}$)
and $h(z_1, \ldots, z_n) = z_1 \cdots z_n$ are continuous from $\mathbb{C}^n$ to $\mathbb{C}$, it follows from
Lemma 1.6 that (finite) linear combinations and products of complex measurable
functions are measurable. Thus, the complex measurable functions form an
algebra over the complex field (similarly, the real measurable functions form
an algebra over the real field), for the usual pointwise operations.

If $f$ has values in $\mathbb{R}$, $[-\infty, \infty]$, or $\mathbb{C}$, its measurability implies that of $|f|$, by
Lemma 1.6.

By Lemma 1.7, a complex function is measurable iff its real part $\Re f$ and
imaginary part $\Im f$ are both measurable.

If $f, g$ are measurable with values in $[0, \infty]$, the functions $f + g$ and $fg$ are
well-defined pointwise (with values in $[0, \infty]$) and measurable, since the functions
$(s, t) \mapsto s + t$ and $(s, t) \mapsto st$ from $[0, \infty]^2$ to $[0, \infty]$ are Borel (cf. Lemmas 1.6
and 1.7).

The function $f : X \to \mathbb{C}$ is simple if its range is a finite set $\{c_1, \ldots, c_n\} \subset \mathbb{C}$.
Let $E_k := [f = c_k]$, $k = 1, \ldots, n$. Then $X$ is the disjoint union of the sets
$E_k$, and

\[ f = \sum_{k=1}^{n} c_k I_{E_k}, \]

where $I_E$ denotes the indicator of $E$ (also called the characteristic function of
$E$ by non-probabilists, while probabilists reserve the latter name to a different
concept):

\[ I_E(x) = 1 \ \text{for } x \in E \quad \text{and} \quad = 0 \ \text{for } x \in E^c. \]

Since a singleton $\{c\} \subset \mathbb{C}$ is closed, it is a Borel set. Suppose now that the
simple (complex) function $f$ is defined on a measurable space $(X, \mathcal{A})$. If $f$ is
measurable, then $E_k := [f = c_k]$ is measurable for all $k = 1, \ldots, n$. Conversely,
if all $E_k$ are measurable, then for each open $V \subset \mathbb{C}$,

\[ [f \in V] = \bigcup_{\{k;\ c_k \in V\}} E_k \in \mathcal{A}, \]

so that $f$ is measurable. In particular, an indicator $I_E$ is measurable iff $E \in \mathcal{A}$.
Let $B(X, \mathcal{A})$ denote the complex algebra of all bounded complex
$\mathcal{A}$-measurable functions on $X$ (for the pointwise operations), and denote

\[ \|f\| := \sup_X |f| \qquad (f \in B(X, \mathcal{A})). \]

The map $f \mapsto \|f\|$ of $B(X, \mathcal{A})$ into $[0, \infty)$ has the following properties:


(1) $\|f\| = 0$ iff $f = 0$ (the zero function);
(2) $\|\alpha f\| = |\alpha|\,\|f\|$ for all $\alpha \in \mathbb{C}$ and $f \in B(X, \mathcal{A})$;
(3) $\|f + g\| \le \|f\| + \|g\|$ for all $f, g \in B(X, \mathcal{A})$;
(4) $\|fg\| \le \|f\|\,\|g\|$ for all $f, g \in B(X, \mathcal{A})$.


For example, (3) is verified by observing that for all $x \in X$,

\[ |f(x) + g(x)| \le |f(x)| + |g(x)| \le \sup |f| + \sup |g|. \]


A map $\|\cdot\|$ from any (complex) vector space $Z$ to $[0, \infty)$ with Properties (1)-(3)
is called a norm on $Z$. The previous example is the supremum norm or uniform
norm on the vector space $Z = B(X, \mathcal{A})$. Property (1) is the definiteness of the
norm; Property (2) is its homogeneity; Property (3) is the triangle inequality. A
vector space with a specified norm is a normed space. If $Z$ is an algebra, and the
specified norm satisfies Property (4) also, $Z$ is called a normed algebra. Thus,
$B(X, \mathcal{A})$ is a normed algebra with respect to the supremum norm. Any normed
space $Z$ is a metric space for the metric induced by the norm

\[ d(u, v) := \|u - v\| \qquad u, v \in Z. \]

Convergence in $Z$ is convergence with respect to this metric (unless stated
otherwise). Thus, convergence in the normed space $B(X, \mathcal{A})$ is precisely uniform
convergence on $X$ (this explains the name “uniform norm”).

If $x, y \in Z$, the triangle inequality implies $\|x\| = \|(x - y) + y\| \le \|x - y\| + \|y\|$,
so that $\|x\| - \|y\| \le \|x - y\|$. Since we may interchange $x$ and $y$, we have

\[ \big|\, \|x\| - \|y\| \,\big| \le \|x - y\|. \]

In particular, the norm function is continuous on $Z$.
The simple functions in $B(X, \mathcal{A})$ form a subalgebra $B_0(X, \mathcal{A})$; it is dense in
$B(X, \mathcal{A})$:


Theorem 1.8 (Approximation theorem). Let $(X, \mathcal{A})$ be a measurable space.
Then:


(1) $B_0(X, \mathcal{A})$ is dense in $B(X, \mathcal{A})$ (i.e., every bounded complex measurable
function is the uniform limit of a sequence of simple measurable complex
functions).

(2) If $f : X \to [0, \infty]$ is measurable, then there exists a sequence of measurable
simple functions

\[ 0 \le \phi_1 \le \phi_2 \le \cdots \le f \]

such that $f = \lim \phi_n$.
Proof. (1) Since any $f \in B(X, \mathcal{A})$ can be written as

\[ f = u^+ - u^- + i v^+ - i v^- \]

with $u = \Re f$ and $v = \Im f$, it suffices to prove (1) for $f$ with range in $[0, \infty)$. Let
$N$ be the first integer such that $N > \sup f$. For $n = 1, 2, \ldots$, set

\[ \phi_n := \sum_{k=1}^{N 2^n} \frac{k-1}{2^n}\, I_{E_{n,k}}, \]

where

\[ E_{n,k} := f^{-1}\Big(\Big[\frac{k-1}{2^n}, \frac{k}{2^n}\Big)\Big). \]

The simple functions $\phi_n$ are measurable,

\[ 0 \le \phi_1 \le \phi_2 \le \cdots \le f, \]

and

\[ 0 \le f - \phi_n < \frac{1}{2^n}, \]

so that indeed $\|f - \phi_n\| \le 1/2^n$, as wanted.

(2) If $f$ has range in $[0, \infty]$, set

\[ \phi_n := \sum_{k=1}^{n 2^n} \frac{k-1}{2^n}\, I_{E_{n,k}} + n I_{F_n}, \]

where $F_n := [f \ge n]$. Again $\{\phi_n\}$ is a non-decreasing sequence of non-negative
measurable simple functions $\le f$. If $f(x) = \infty$ for some $x \in X$, then $x \in F_n$
for all $n$, and therefore $\phi_n(x) = n$ for all $n$; hence $\lim_n \phi_n(x) = \infty = f(x)$. If
$f(x) < \infty$ for some $x$, let $n > f(x)$. Then there exists a unique $k$, $1 \le k \le n 2^n$,
such that $x \in E_{n,k}$. Then $\phi_n(x) = (k - 1)/2^n$ while $(k - 1)/2^n \le f(x) <
k/2^n$, so that

\[ 0 \le f(x) - \phi_n(x) < 1/2^n \qquad (n > f(x)). \]

Hence $f(x) = \lim_n \phi_n(x)$ for all $x \in X$.
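
The dyadic construction in the proof acts on each value $f(x)$ separately, so it can be tried out on individual numbers. The sketch below assumes the definitions $E_{n,k} = [(k-1)/2^n \le f < k/2^n]$ and $F_n = [f \ge n]$ used in part (2); the function name `phi` and the sample values are ours.

```python
import math


def phi(n, value):
    """Value phi_n(x) of the n-th dyadic approximation, given value = f(x) in [0, inf]."""
    if value >= n:  # x lies in F_n = [f >= n]; this covers f(x) = inf
        return float(n)
    # Otherwise x lies in exactly one E_{n,k}, with (k-1)/2^n <= f(x) < k/2^n.
    k = math.floor(value * 2 ** n) + 1
    return (k - 1) / 2 ** n


# Hypothetical sample value f(x) = pi: the approximations increase towards pi.
for n in range(1, 8):
    print(n, phi(n, math.pi))
```

For $n \le f(x)$ the approximation is capped at $n$; once $n > f(x)$ it is the largest multiple of $2^{-n}$ not exceeding $f(x)$, which gives the error bound $2^{-n}$ stated above.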


1.2 Positive measures 


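Definition 1.9. Let $(X, \mathcal{A})$ be a measurable space. A (positive) measure on $\mathcal{A}$
is a function

\[ \mu : \mathcal{A} \to [0, \infty] \]

such that $\mu(\emptyset) = 0$ and

\[ \mu\Big(\bigcup_{k=1}^{\infty} E_k\Big) = \sum_{k=1}^{\infty} \mu(E_k) \tag{1} \]

for any sequence of mutually disjoint sets $E_k \in \mathcal{A}$. Property (1) is called
$\sigma$-additivity of the function $\mu$. The ordered triple $(X, \mathcal{A}, \mu)$ will be called a
(positive) measure space.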


Taking in particular $E_k = \emptyset$ for all $k > n$, it follows that

\[ \mu\Big(\bigcup_{k=1}^{n} E_k\Big) = \sum_{k=1}^{n} \mu(E_k) \tag{2} \]

for any finite collection of mutually disjoint sets $E_k \in \mathcal{A}$, $k = 1, \ldots, n$. We refer
to Property (2) by saying that $\mu$ is (finitely) additive.
Any finitely additive function $\mu \ge 0$ on an algebra $\mathcal{A}$ is necessarily monotonic,
that is, $\mu(E) \le \mu(F)$ when $E \subset F$ ($E, F \in \mathcal{A}$); indeed

\[ \mu(F) = \mu(E \cup (F - E)) = \mu(E) + \mu(F - E) \ge \mu(E). \]

If $\mu(E) < \infty$, we get

\[ \mu(F - E) = \mu(F) - \mu(E). \]

Lemma 1.10. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let

\[ E_1 \subset E_2 \subset E_3 \subset \cdots \]

be measurable sets with union $E$. Then

\[ \mu(E) = \lim_n \mu(E_n). \]


Proof. The sets $E_n$ and $E$ can be written as disjoint unions

\[ E_n = E_1 \cup (E_2 - E_1) \cup (E_3 - E_2) \cup \cdots \cup (E_n - E_{n-1}), \]
\[ E = E_1 \cup (E_2 - E_1) \cup (E_3 - E_2) \cup \cdots, \]

where all differences belong to $\mathcal{A}$. Set $E_0 = \emptyset$. By $\sigma$-additivity,

\[ \mu(E) = \sum_{k=1}^{\infty} \mu(E_k - E_{k-1}) = \lim_n \sum_{k=1}^{n} \mu(E_k - E_{k-1}) = \lim_n \mu(E_n). \]


In general, if $E_j$ belong to an algebra $\mathcal{A}$ of subsets of $X$, set $A_0 = \emptyset$ and
$A_n = \bigcup_{j=1}^{n} E_j$, $n = 1, 2, \ldots$. The sets $A_j - A_{j-1}$, $1 \le j \le n$, are disjoint
$\mathcal{A}$-measurable subsets of $E_j$ with union $A_n$. If $\mu$ is a non-negative additive set
function on $\mathcal{A}$, then

\[ \mu\Big(\bigcup_{j=1}^{n} E_j\Big) = \mu(A_n) = \sum_{j=1}^{n} \mu(A_j - A_{j-1}) \le \sum_{j=1}^{n} \mu(E_j). \tag{$*$} \]

This is the subadditivity property of non-negative additive set functions (on
algebras).

If $\mathcal{A}$ is a $\sigma$-algebra and $\mu$ is a positive measure on $\mathcal{A}$, then since $A_1 \subset A_2 \subset \cdots$
and $\bigcup_{n=1}^{\infty} A_n = \bigcup_{j=1}^{\infty} E_j$, letting $n \to \infty$ in ($*$), it follows from Lemma 1.10 that

\[ \mu\Big(\bigcup_{j=1}^{\infty} E_j\Big) \le \sum_{j=1}^{\infty} \mu(E_j). \]

This property of positive measures is called $\sigma$-subadditivity.
For decreasing sequences of measurable sets, the “dual” of Lemma 1.10 is
false in general, unless we assume that the sets have finite measure:


Lemma 1.11. Let $\{E_k\} \subset \mathcal{A}$ be a decreasing sequence (with respect to
set-inclusion) such that $\mu(E_1) < \infty$. Let $E = \bigcap_k E_k$. Then

\[ \mu(E) = \lim_n \mu(E_n). \]


Proof. The sequence $\{E_1 - E_k\}$ is increasing, with union $E_1 - E$. By Lemma 1.10
and the finiteness of the measures of $E$ and $E_k$ (subsets of $E_1$!),

\[ \mu(E_1) - \mu(E) = \mu(E_1 - E) = \mu\Big(\bigcup_k (E_1 - E_k)\Big) = \lim_n \mu(E_1 - E_n) = \mu(E_1) - \lim_n \mu(E_n), \]

and the result follows by cancelling the finite number $\mu(E_1)$.


If $\{E_n\}$ is an arbitrary sequence of subsets of $X$, set $F_n = \bigcap_{k \ge n} E_k$ and
$G_n = \bigcup_{k \ge n} E_k$. Then $\{F_n\}$ ($\{G_n\}$) is increasing (decreasing, respectively), and
$F_n \subset E_n \subset G_n$ for all $n$.

One defines

\[ \liminf_n E_n := \bigcup_n F_n; \qquad \limsup_n E_n := \bigcap_n G_n. \]

These sets belong to $\mathcal{A}$ if $E_k \in \mathcal{A}$ for all $k$. The set $\liminf E_n$ consists of all $x$
that belong to $E_n$ for all but finitely many $n$; the set $\limsup E_n$ consists of all $x$
that belong to $E_n$ for infinitely many $n$. By Lemma 1.10,

\[ \mu(\liminf E_n) = \lim_n \mu(F_n) \le \liminf \mu(E_n). \tag{3} \]

If the measure of $G_1$ is finite, we also have by Lemma 1.11

\[ \mu(\limsup E_n) = \lim_n \mu(G_n) \ge \limsup \mu(E_n). \tag{4} \]
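
The inequalities (3) and (4) can be strict. The sketch below illustrates this for the counting measure and a periodic sequence of finite sets; the sets, the period, and the helper names `E`, `F`, `G` are assumptions chosen for the example. For a periodic sequence, the tail intersections and unions are already determined by one full period, which is what makes the computation finite.

```python
from functools import reduce


def E(n):
    """Hypothetical periodic sequence of sets: E_n depends only on the parity of n."""
    return {0, 1} if n % 2 == 0 else {1, 2, 3}


PERIOD = 2


def F(n):
    """F_n = intersection of E_k over k >= n (one period suffices by periodicity)."""
    return reduce(set.intersection, (E(k) for k in range(n, n + PERIOD)))


def G(n):
    """G_n = union of E_k over k >= n (one period suffices by periodicity)."""
    return reduce(set.union, (E(k) for k in range(n, n + PERIOD)))


lim_inf = reduce(set.union, (F(n) for n in range(1, 1 + PERIOD)))
lim_sup = reduce(set.intersection, (G(n) for n in range(1, 1 + PERIOD)))

# Counting measure: mu(A) = len(A).
sizes = [len(E(n)) for n in range(1, 1 + PERIOD)]
print(lim_inf, lim_sup, min(sizes), max(sizes))
```

Here $\liminf E_n$ has one point while $\liminf \mu(E_n) = 2$, and $\limsup E_n$ has four points while $\limsup \mu(E_n) = 3$, so both inequalities are strict in this example.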


1.3 Integration of non-negative measurable functions


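Definition 1.12. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and $\phi : X \to [0, \infty)$
a measurable simple function. The integral over $X$ of $\phi$ with respect to $\mu$, denoted

\[ \int_X \phi\,d\mu \]

or briefly

\[ \int \phi\,d\mu, \]

is the finite sum

\[ \sum_k c_k\,\mu(E_k), \]

where

\[ \phi = \sum_k c_k I_{E_k}, \qquad E_k = [\phi = c_k], \]

and $c_k$ are the distinct values of $\phi$.
Note that

\[ \int I_E\,d\mu = \mu(E) \qquad E \in \mathcal{A} \]

and

\[ 0 \le \int \phi\,d\mu \le \|\phi\|\,\mu([\phi \ne 0]). \tag{1} \]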


For an arbitrary measurable function $f : X \to [0, \infty]$, consider the (non-empty)
set $S_f$ of measurable simple functions $\phi$ such that $0 \le \phi \le f$, and define

\[ \int f\,d\mu := \sup_{\phi \in S_f} \int \phi\,d\mu. \tag{2} \]

For any $E \in \mathcal{A}$, the integral over $E$ of $f$ is defined by

\[ \int_E f\,d\mu := \int f I_E\,d\mu. \tag{3} \]


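Let $\phi, \psi$ be measurable simple functions; let $c_k, d_j$ be the distinct values of $\phi$
and $\psi$, taken on the (mutually disjoint) sets $E_k$ and $F_j$, respectively. Denote
$Q := \{(k, j) \in \mathbb{N}^2;\ E_k \cap F_j \ne \emptyset\}$.

If $\phi \le \psi$, then $c_k \le d_j$ for $(k, j) \in Q$. Hence

\[ \int \phi\,d\mu = \sum_k c_k\,\mu(E_k) = \sum_{(k,j) \in Q} c_k\,\mu(E_k \cap F_j)
\le \sum_{(k,j) \in Q} d_j\,\mu(E_k \cap F_j) = \sum_j d_j\,\mu(F_j) = \int \psi\,d\mu. \]

Thus, the integral is monotonic on simple functions.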

If $f$ is simple, then $\int \phi \, d\mu \le \int f \, d\mu$ for all $\phi \in S_f$ (by monotonicity of the
integral on simple functions), and therefore the supremum in (2) is less than
or equal to the integral of $f$ as a simple function; since $f \in S_f$, the reverse
inequality is trivial, so that the two definitions of the integral of $f$ coincide for
$f$ simple.

Since $S_{cf} = cS_f := \{c\phi;\ \phi \in S_f\}$ for $0 < c < \infty$, we have (for $f$ as above)

$$\int cf \, d\mu = c \int f \, d\mu \qquad (0 \le c < \infty). \tag{4}$$

If $f \le g$ ($f, g$ as above), $S_f \subset S_g$, and therefore $\int f \, d\mu \le \int g \, d\mu$ (monotonicity
of the integral with respect to the "integrand").

In particular, if $E \subset F$ (both measurable), then $f I_E \le f I_F$, and therefore
$\int_E f \, d\mu \le \int_F f \, d\mu$ (monotonicity of the integral with respect to the set of
integration).

If $\mu(E) = 0$, then any $\phi \in S_{f I_E}$ assumes its non-zero values $c_k$ on the
sets $E_k \cap E$, that have measure 0 (as measurable subsets of $E$), and therefore
$\int \phi \, d\mu = 0$ for all such $\phi$, hence $\int_E f \, d\mu = 0$.

If $f = 0$ on $E$ (for some $E \in \mathcal{A}$), then $f I_E$ is the zero function, hence has
zero integral (by definition of the integral of simple functions!); this means that
$\int_E f \, d\mu = 0$ when $f = 0$ on $E$.

Consider now the set function

$$\nu(E) := \int_E \phi \, d\mu, \qquad E \in \mathcal{A}, \tag{5}$$

for a fixed simple measurable function $\phi \ge 0$. As a special case of the preceding
remark, $\nu(\emptyset) = 0$. Write $\phi = \sum_k c_k I_{E_k}$, and let $A_j \in \mathcal{A}$ be mutually disjoint
($j = 1, 2, \dots$) with union $A$. Then

$$\phi I_A = \sum_k c_k I_{E_k \cap A},$$

so that, by the $\sigma$-additivity of $\mu$ and the possibility of interchanging summation
order when the summands are non-negative,

$$\nu(A) = \sum_k c_k \mu(E_k \cap A) = \sum_k c_k \sum_j \mu(E_k \cap A_j)
  = \sum_j \sum_k c_k \mu(E_k \cap A_j) = \sum_j \nu(A_j).$$

Thus $\nu$ is a positive measure. This is actually true for any measurable $\phi \ge 0$
(not necessarily simple), but this will be proved later.

If $\psi, \chi$ are simple functions as above (the distinct values of $\psi$ and $\chi$
being $a_1, \dots, a_p$ and $b_1, \dots, b_q$, assumed on the measurable sets $F_1, \dots, F_p$ and
$G_1, \dots, G_q$, respectively), then the simple measurable function $\phi := \psi + \chi$
assumes the constant value $a_i + b_j$ on the set $F_i \cap G_j$, and therefore, defining
the measure $\nu$ as shown, we have

$$\nu(F_i \cap G_j) = (a_i + b_j) \, \mu(F_i \cap G_j). \tag{6}$$

But $a_i$ and $b_j$ are the constant values of $\psi$ and $\chi$ on the set $F_i \cap G_j$ (respectively),
so that the right-hand side of (6) equals $\nu'(F_i \cap G_j) + \nu''(F_i \cap G_j)$, where $\nu'$ and
$\nu''$ are the measures defined as $\nu$, with the integrands $\psi$ and $\chi$ instead of $\phi$.
Summing over all $i, j$, since $X$ is the disjoint union of the sets $F_i \cap G_j$, the
additivity of the measures $\nu, \nu'$, and $\nu''$ implies that $\nu(X) = \nu'(X) + \nu''(X)$,
that is,

$$\int (\psi + \chi) \, d\mu = \int \psi \, d\mu + \int \chi \, d\mu. \tag{7}$$


Property (7) is the additivity of the integral over non-negative measurable 
simple functions. This property too is extended later to arbitrary non-negative 
measurable functions. 


Theorem 1.13. Let $(X, \mathcal{A}, \mu)$ be a positive measure space. Let

$$f_1 \le f_2 \le f_3 \le \cdots : X \to [0, \infty]$$

be measurable, and denote $f = \lim f_n$ (defined pointwise). Then

$$\int f \, d\mu = \lim_n \int f_n \, d\mu. \tag{8}$$


This is the Monotone Convergence theorem of Lebesgue. 




Proof. By Lemma 1.5, $f$ is measurable (with range in $[0, \infty]$). The monotonicity
of the integral (and the fact that $f_n \le f_{n+1} \le f$) implies that

$$\int f_n \, d\mu \le \int f_{n+1} \, d\mu \le \int f \, d\mu,$$

and therefore the limit in (8) exists ($:= c \in [0, \infty]$) and the inequality $\ge$ holds
in (8). It remains to show the inequality $\le$ in (8). Let $0 < t < 1$. Given $\phi \in S_f$,
denote

$$A_n = [t\phi \le f_n] = [f_n - t\phi \ge 0] \qquad (n = 1, 2, \dots).$$

Then $A_n \in \mathcal{A}$ and $A_1 \subset A_2 \subset \cdots$ (because $f_1 \le f_2 \le \cdots$). If $x \in X$ is such
that $\phi(x) = 0$, then $x \in A_n$ (for all $n$). If $x \in X$ is such that $\phi(x) > 0$, then
$f(x) \ge \phi(x) > t\phi(x)$, and there exists therefore $n$ for which $f_n(x) \ge t\phi(x)$, that
is, $x \in A_n$ (for that $n$). This shows that $\bigcup_n A_n = X$. Consider the measure $\nu$
defined by (5) (for the simple function $t\phi$). By Lemma 1.10,

$$t \int \phi \, d\mu = \nu(X) = \lim_n \nu(A_n) = \lim_n \int_{A_n} t\phi \, d\mu.$$

However, $t\phi \le f_n$ on $A_n$, so the integrals on the right are $\le \int_{A_n} f_n \, d\mu \le \int f_n \, d\mu$
(by the monotonicity property of integrals with respect to the set of integration).
Therefore $t \int \phi \, d\mu \le c$, and so $\int \phi \, d\mu \le c$ by the arbitrariness of $t \in (0, 1)$. Taking
the supremum over all $\phi \in S_f$, we conclude that $\int f \, d\mu \le c$ as wanted.


For arbitrary sequences of non-negative measurable functions we have the 
following inequality: 


Theorem 1.14 (Fatou's lemma). Let $f_n : X \to [0, \infty]$, $n = 1, 2, \dots$, be
measurable. Then

$$\int \liminf_n f_n \, d\mu \le \liminf_n \int f_n \, d\mu.$$

Proof. We have

$$\liminf_n f_n := \lim_n \Big( \inf_{k \ge n} f_k \Big).$$

Denote the infimum on the right by $g_n$. Then $g_n$, $n = 1, 2, \dots$, are measurable,

$$g_n \le f_n, \qquad 0 \le g_1 \le g_2 \le \cdots,$$

and $\lim_n g_n = \liminf_n f_n$. By Theorem 1.13,

$$\int \liminf_n f_n \, d\mu = \int \lim_n g_n \, d\mu = \lim_n \int g_n \, d\mu.$$

But the integrals on the right are $\le \int f_n \, d\mu$, therefore their limit is
$\le \liminf_n \int f_n \, d\mu$.
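The inequality in Fatou's lemma can be strict. A standard illustration (not from the text; it assumes the counting measure $\mu$ on $\mathbb{N}$) is a unit mass that escapes to infinity:

```latex
\[
  f_n = I_{\{n\}} \quad (n = 1, 2, \dots), \qquad
  \int f_n \, d\mu = \mu(\{n\}) = 1 \ \text{ for all } n .
\]
\[
  \liminf_n f_n = 0 \ \text{ pointwise}, \qquad
  \int \liminf_n f_n \, d\mu = 0 \;<\; 1 = \liminf_n \int f_n \, d\mu .
\]
```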




Another consequence of Theorem 1.13 is the additivity of the integral of 
non-negative measurable functions. 


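Theorem 1.15. Let $f, g : X \to [0, \infty]$ be measurable. Then

$$\int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu.$$

Proof. By the Approximation theorem (Theorem 1.8), there exist simple
measurable functions $\phi_n, \psi_n$ such that

$$0 \le \phi_1 \le \phi_2 \le \dots, \quad \lim \phi_n = f,$$
$$0 \le \psi_1 \le \psi_2 \le \dots, \quad \lim \psi_n = g.$$

Then the measurable simple functions $\chi_n = \phi_n + \psi_n$ satisfy

$$0 \le \chi_1 \le \chi_2 \le \dots, \quad \lim \chi_n = f + g.$$

By Theorem 1.13 and the additivity of the integral of (non-negative measurable)
simple functions (cf. (7)), we have

$$\int (f + g) \, d\mu = \lim \int \chi_n \, d\mu = \lim \int (\phi_n + \psi_n) \, d\mu
  = \lim \int \phi_n \, d\mu + \lim \int \psi_n \, d\mu = \int f \, d\mu + \int g \, d\mu.$$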


The additivity property of the integral is also true for infinite sums of non- 
negative measurable functions: 


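Theorem 1.16 (Beppo Levi). Let $f_n : X \to [0, \infty]$, $n = 1, 2, \dots$, be measurable.
Then

$$\int \sum_{n=1}^{\infty} f_n \, d\mu = \sum_{n=1}^{\infty} \int f_n \, d\mu.$$

Proof. Let

$$g_k = \sum_{n=1}^{k} f_n; \qquad g = \sum_{n=1}^{\infty} f_n.$$

The measurable functions $g_k$ satisfy

$$0 \le g_1 \le g_2 \le \dots, \quad \lim g_k = g,$$

and by Theorem 1.15 (and induction)

$$\int g_k \, d\mu = \sum_{n=1}^{k} \int f_n \, d\mu.$$

Therefore, by Theorem 1.13

$$\int g \, d\mu = \lim_k \int g_k \, d\mu = \lim_k \sum_{n=1}^{k} \int f_n \, d\mu = \sum_{n=1}^{\infty} \int f_n \, d\mu.$$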


We may extend now the measure property of $\nu$, defined earlier with a simple
integrand, to the general case of a non-negative measurable integrand.

Theorem 1.17. Let $f : X \to [0, \infty]$ be measurable, and set

$$\nu(E) := \int_E f \, d\mu, \qquad E \in \mathcal{A}.$$

Then $\nu$ is a (positive) measure on $\mathcal{A}$, and for any measurable $g : X \to [0, \infty]$,

$$\int g \, d\nu = \int g f \, d\mu. \tag{*}$$

Proof. Let $E_j \in \mathcal{A}$, $j = 1, 2, \dots$ be mutually disjoint, with union $E$. Then

$$f I_E = \sum_{j=1}^{\infty} f I_{E_j},$$

and therefore, by Theorem 1.16,

$$\nu(E) := \int f I_E \, d\mu = \sum_j \int f I_{E_j} \, d\mu = \sum_j \nu(E_j).$$

Thus, $\nu$ is a measure.
If $g = I_E$ for some $E \in \mathcal{A}$, then

$$\int g \, d\nu = \nu(E) = \int I_E f \, d\mu = \int g f \, d\mu.$$

By (4) and Theorem 1.15 (for the measures $\mu$ and $\nu$), (*) is valid for $g$ simple.
Finally, for general $g$, the Approximation theorem (Theorem 1.8) provides a
sequence of simple measurable functions

$$0 \le \phi_1 \le \phi_2 \le \cdots; \quad \lim \phi_n = g.$$

Then the measurable functions $\phi_n f$ satisfy

$$0 \le \phi_1 f \le \phi_2 f \le \cdots; \quad \lim \phi_n f = g f,$$

and Theorem 1.13 implies that

$$\int g \, d\nu = \lim \int \phi_n \, d\nu = \lim \int \phi_n f \, d\mu = \int g f \, d\mu.$$


Relation (*) is conveniently abbreviated as 


$$d\nu = f \, d\mu.$$
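Relation (*) is easy to verify by hand on a finite set with point masses, where $\nu$ is determined by $\nu(\{x\}) = f(x)\mu(\{x\})$. The sketch below is only such a sanity check; the helper `integral` and the particular `mu`, `f`, `g` are hypothetical choices.

```python
from fractions import Fraction

def integral(g, weights):
    """Integral of a non-negative function on a finite set with point masses."""
    return sum(g(x) * w for x, w in weights.items())

# Hypothetical measure space: X = {1, 2, 3} with point masses.
mu = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
f = lambda x: x * x      # density
g = lambda x: x + 1      # integrand

# nu(E) is the integral of f over E; on a finite space it is enough to
# record the point masses nu({x}) = f(x) * mu({x}).
nu = {x: f(x) * w for x, w in mu.items()}

lhs = integral(g, nu)                        # integral of g with respect to nu
rhs = integral(lambda x: g(x) * f(x), mu)    # integral of g*f with respect to mu
```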


Observe that if $f_1$ and $f_2$ coincide almost everywhere (briefly, "a.e." or $\mu$-a.e.,
if the measure needs to be specified), that is, if they coincide except on a null
set $A \in \mathcal{A}$ (more precisely, a $\mu$-null set, that is, a measurable set $A$ such that
$\mu(A) = 0$), then the corresponding measures $\nu_i$ are equal, and in particular
$\int f_1 \, d\mu = \int f_2 \, d\mu$. Indeed, for all $E \in \mathcal{A}$, $\mu(E \cap A) = 0$, and therefore

$$\nu_i(E \cap A) = \int_{E \cap A} f_i \, d\mu = 0 \qquad (i = 1, 2)$$

by one of the observations following Definition 1.12. Hence

$$\nu_1(E) = \nu_1(E \cap A) + \nu_1(E \cap A^c) = \nu_1(E \cap A^c)
  = \nu_2(E \cap A^c) = \nu_2(E).$$


1.4 Integrable functions 


Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let $f$ be a measurable function
with range in $[-\infty, \infty]$ or $\overline{\mathbb{C}} := \mathbb{C} \cup \{\infty\}$ (the Riemann sphere). Then $|f| :
X \to [0, \infty]$ is measurable, and has therefore an integral ($\in [0, \infty]$). In case this
integral is finite, we shall say that $f$ is integrable. In that case, the measurable
set $[|f| = \infty]$ has measure zero. Indeed, it is contained in $[|f| > n]$ for all
$n = 1, 2, \dots$, and

$$n \mu([|f| > n]) = \int_{[|f| > n]} n \, d\mu \le \int_{[|f| > n]} |f| \, d\mu \le \int |f| \, d\mu.$$

Hence for all $n$,

$$0 \le \mu([|f| = \infty]) \le \frac{1}{n} \int |f| \, d\mu,$$

and since the integral on the right is finite, we must have $\mu([|f| = \infty]) = 0$.

In other words, an integrable function is finite a.e. 

We observed previously that non-negative measurable functions that coincide 
a.e. have equal integrals. This property is desirable in the general case now 
considered. If $f$ is measurable, and if we redefine it as the finite arbitrary constant
$c$ on a set $A \in \mathcal{A}$ of measure zero, then the new function $g$ is also measurable.
Indeed, for any open set $V$ in the range space,

$$[g \in V] = \{[g \in V] \cap A^c\} \cup \{[g \in V] \cap A\}.$$

The second set on the right is empty if $c \in V^c$, and is $A$ if $c \in V$, thus belongs
to $\mathcal{A}$ in any case. The first set on the right is equal to $[f \in V] \cap A^c \in \mathcal{A}$, by the
measurability of $f$. Thus $[g \in V] \in \mathcal{A}$.

If $f$ is integrable, we can redefine it as an arbitrary finite constant on
the set $[|f| = \infty]$ (that has measure zero) and obtain a new finite-valued
measurable function, whose integral should be the same as the integral of $f$ (by
the "desirable" property mentioned before). This discussion shows that we may
restrict ourselves to complex (or, as a special case, to real) valued measurable
functions.


Definition 1.18. Let $(X, \mathcal{A}, \mu)$ be a positive measure space. The function $f :
X \to \mathbb{C}$ is integrable if it is measurable and

$$\|f\|_1 := \int |f| \, d\mu < \infty.$$

The set of all (complex) integrable functions will be denoted by

$$L^1(X, \mathcal{A}, \mu),$$

or briefly by $L^1(\mu)$ or $L^1(X)$ or $L^1$, when the unmentioned "objects" of the
measure space are understood.

Defining the operations pointwise, $L^1$ is a complex vector space, since the
inequality

$$|\alpha f + \beta g| \le |\alpha| \, |f| + |\beta| \, |g|$$

implies, by monotonicity, additivity, and homogeneity of the integral of
non-negative measurable functions:

$$\|\alpha f + \beta g\|_1 \le |\alpha| \, \|f\|_1 + |\beta| \, \|g\|_1 < \infty,$$

for all $f, g \in L^1$ and $\alpha, \beta \in \mathbb{C}$.

In particular, $\|\cdot\|_1$ satisfies the triangle inequality (take $\alpha = \beta = 1$), and is
trivially homogeneous.

Suppose $\|f\|_1 = 0$. For any $n = 1, 2, \dots$,

$$0 \le \mu([|f| > 1/n]) = \int_{[|f| > 1/n]} d\mu = n \int_{[|f| > 1/n]} (1/n) \, d\mu
  \le n \int_{[|f| > 1/n]} |f| \, d\mu \le n \|f\|_1 = 0,$$

so $\mu([|f| > 1/n]) = 0$. Now the set where $f$ is not zero is

$$[|f| > 0] = \bigcup_{n=1}^{\infty} [|f| > 1/n],$$

and by the $\sigma$-subadditivity property of positive measures, it follows that this
set has measure zero. Thus, the vanishing of $\|f\|_1$ implies that $f = 0$ a.e.
(the converse is trivially true). One verifies easily that the relation "$f = g$ a.e."
is an equivalence relation for complex measurable functions (transitivity follows
from the fact that the union of two sets of measure zero has measure zero, by
subadditivity of positive measures). All the functions $f$ in the same equivalence
class have the same value of $\|f\|_1$ (cf. discussion following Theorem 1.17).

We use the same notation $L^1$ for the space of all equivalence classes of
integrable functions, with operations performed as usual on representatives of
the classes, and with the $\|\cdot\|_1$-norm of a class equal to the norm of any of its
representatives; $L^1$ is a normed space (for the norm $\|\cdot\|_1$). It is customary,
however, to think of the elements of $L^1$ as functions (rather than equivalence
classes of functions!).

If $f \in L^1$, then $f = u + iv$ with $u := \Re f$ and $v := \Im f$ real measurable
functions (cf. discussion following Lemma 1.7), and since $|u|, |v| \le |f|$, we have
$\|u\|_1, \|v\|_1 \le \|f\|_1 < \infty$, that is, $u, v$ are real elements of $L^1$ (conversely, if $u, v$
are real elements of $L^1$, then $f = u + iv \in L^1$, since $L^1$ is a complex vector
space).

Writing $u = u^+ - u^-$ (and similarly for $v$), we obtain four non-negative
(finite) measurable functions (cf. remarks following Lemma 1.5), and since $u^+ \le
|u| \le |f|$ (and similarly for $u^-$, etc.), they have finite integrals. It makes sense
therefore to define

$$\int u \, d\mu := \int u^+ \, d\mu - \int u^- \, d\mu$$

(on the right, one has the difference of two finite non-negative real numbers!).
Doing the same with $v$, we then let

$$\int f \, d\mu := \int u \, d\mu + i \int v \, d\mu.$$

Note that according to this definition,

$$\Re \int f \, d\mu = \int \Re f \, d\mu,$$

and similarly for the imaginary part.


Theorem 1.19. The map $f \mapsto \int f \, d\mu \in \mathbb{C}$ is a continuous linear functional on
the normed space $L^1(\mu)$.

Proof. Consider first real-valued functions $f, g \in L^1$. Let $h = f + g$. Then

$$h^+ - h^- = (f^+ - f^-) + (g^+ - g^-),$$

and since all functions here have finite values,

$$h^+ + f^- + g^- = h^- + f^+ + g^+.$$

By Theorem 1.15,

$$\int h^+ \, d\mu + \int f^- \, d\mu + \int g^- \, d\mu = \int h^- \, d\mu + \int f^+ \, d\mu + \int g^+ \, d\mu.$$

All integrals above are finite, so we may subtract $\int h^- + \int f^- + \int g^-$ from both
sides of the equation. This yields:

$$\int h \, d\mu = \int f \, d\mu + \int g \, d\mu.$$

The additivity of the integral extends trivially to complex functions in $L^1$.
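If $f \in L^1$ is real and $c \in [0, \infty)$, $(cf)^+ = cf^+$ and similarly for $f^-$. Therefore,
by (4) (following Definition 1.12),

$$\int cf \, d\mu = \int cf^+ \, d\mu - \int cf^- \, d\mu = c \int f^+ \, d\mu - c \int f^- \, d\mu = c \int f \, d\mu.$$

If $c \in (-\infty, 0)$, $(cf)^+ = -cf^-$ and $(cf)^- = -cf^+$, and a similar calculation
shows again that $\int (cf) = c \int f$. For $f \in L^1$ complex and $c$ real, write $f = u + iv$.
Then

$$\int cf = \int (cu + icv) = \int (cu) + i \int (cv) = c \Big( \int u + i \int v \Big) = c \int f.$$

Note next that

$$\int (if) = \int (-v + iu) = -\int v + i \int u = i \int f.$$

Finally, if $c = a + ib$ ($a, b$ real), then by additivity of the integral and the previous
remarks,

$$\int (cf) = \int (af + ibf) = \int af + \int ibf = a \int f + ib \int f = c \int f.$$

Thus

$$\int (\alpha f + \beta g) \, d\mu = \alpha \int f \, d\mu + \beta \int g \, d\mu$$

for all $f, g \in L^1$ and $\alpha, \beta \in \mathbb{C}$.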
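For $f \in L^1$, let $\lambda := \int f \, d\mu$ ($\in \mathbb{C}$), and choose $\theta \in \mathbb{R}$ such that $\lambda = |\lambda| e^{i\theta}$.
Then, since the left-hand side of the following equation is real,

$$|\lambda| = e^{-i\theta} \lambda = e^{-i\theta} \int f \, d\mu = \int e^{-i\theta} f \, d\mu = \Re \int e^{-i\theta} f \, d\mu
  = \int \Re(e^{-i\theta} f) \, d\mu \le \int |e^{-i\theta} f| \, d\mu = \int |f| \, d\mu.$$

We thus obtained the important inequality

$$\Big| \int f \, d\mu \Big| \le \int |f| \, d\mu. \tag{1}$$

If $f, g \in L^1$, it follows from the linearity of the integral and (1) that

$$\Big| \int f \, d\mu - \int g \, d\mu \Big| = \Big| \int (f - g) \, d\mu \Big| \le \|f - g\|_1. \tag{2}$$

In particular, if $f$ and $g$ represent the same equivalence class, then $\|f - g\|_1 = 0$,
and therefore $\int f \, d\mu = \int g \, d\mu$. This means that the functional $f \mapsto \int f \, d\mu$ is
well-defined as a functional on the normed space $L^1(\mu)$ (of equivalence classes!),
and its continuity follows trivially from (2).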


In terms of sequences, continuity of the integral on the normed space $L^1$ means
that if $\{f_n\} \subset L^1$ converges to $f$ in the $L^1$-metric, then

$$\int f_n \, d\mu \to \int f \, d\mu. \tag{3}$$

A useful sufficient condition for convergence in the $L^1$-metric, and therefore,
for the validity of (3), is contained in the Dominated Convergence theorem of
Lebesgue:


Theorem 1.20. Let $(X, \mathcal{A}, \mu)$ be a measure space. Let $\{f_n\}$ be a sequence of
complex measurable functions on $X$ that converge pointwise to the function $f$.
Suppose there exists $g \in L^1(\mu)$ (with values in $[0, \infty)$) such that

$$|f_n| \le g \qquad (n = 1, 2, \dots). \tag{4}$$

Then $f, f_n \in L^1(\mu)$ for all $n$, and $f_n \to f$ in the $L^1(\mu)$-metric.
In particular, (3) is valid.

Proof. By Lemma 1.5, $f$ is measurable. By (4) and monotonicity

$$\|f\|_1, \|f_n\|_1 \le \|g\|_1 < \infty,$$

so that $f, f_n \in L^1$.
Since $|f_n - f| \le 2g$, the measurable functions $2g - |f_n - f|$ are non-negative.
By Fatou's Lemma (Theorem 1.14),

$$\int \liminf_n (2g - |f_n - f|) \, d\mu \le \liminf_n \int (2g - |f_n - f|) \, d\mu. \tag{5}$$

The left-hand side of (5) is $\int 2g \, d\mu$. The integral on the right-hand side is
$\int 2g \, d\mu + (-\|f_n - f\|_1)$, and its lim inf is

$$\int 2g \, d\mu + \liminf_n (-\|f_n - f\|_1) = \int 2g \, d\mu - \limsup_n \|f_n - f\|_1.$$

Subtracting the finite number $\int 2g \, d\mu$ from both sides of the inequality, we obtain

$$\limsup_n \|f_n - f\|_1 \le 0.$$

However, if a non-negative sequence $\{a_n\}$ satisfies $\limsup a_n \le 0$, then it
converges to 0 (because $0 \le \liminf a_n \le \limsup a_n \le 0$ implies $\liminf a_n =
\limsup a_n = 0$). Thus $\|f_n - f\|_1 \to 0$.
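The domination hypothesis (4) cannot simply be omitted, even when the convergence is uniform. A standard illustration (not from the text; it assumes the counting measure $\mu$ on $\mathbb{N}$):

```latex
\[
  f_n = \frac{1}{n} \, I_{\{1, \dots, n\}}, \qquad
  \sup_{x} |f_n(x)| = \frac{1}{n} \to 0, \qquad
  \int f_n \, d\mu = \frac{1}{n} \cdot n = 1 \not\to 0 .
\]
\[
  \text{Any } g \text{ with } |f_n| \le g \text{ for all } n \text{ satisfies }
  g(k) \ge \sup_n f_n(k) = \frac{1}{k}, \quad \text{so} \quad
  \int g \, d\mu \ge \sum_{k=1}^{\infty} \frac{1}{k} = \infty .
\]
```

Thus no dominating function in $L^1(\mu)$ exists, and the conclusion (3) fails.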




Rather than assuming pointwise convergence of the sequence $\{f_n\}$ at every
point of $X$, we may assume that the sequence converges almost everywhere, that
is, $f_n \to f$ on a set $E \in \mathcal{A}$ and $\mu(E^c) = 0$. The functions $f_n$ could be defined
only a.e., and we could include the countable union of all the sets where these
functions are not defined (which is a set of measure zero, by the $\sigma$-subadditivity
of measures) in the "exceptional" set $E^c$. The limit function $f$ is defined a.e., in
any case. For such a function, measurability means that $[f \in V] \cap E \in \mathcal{A}$ for
each open set $V$.

If $f_n$ (defined a.e.) converge pointwise a.e. to $f$, then with $E$ as mentioned,
the restrictions $f_n|_E$ are $\mathcal{A}_E$-measurable, where $\mathcal{A}_E$ is the $\sigma$-algebra $\mathcal{A} \cap E$,
because

$$[f_n|_E \in V] = [f_n \in V] \cap E \in \mathcal{A}_E.$$

By Lemma 1.5, $f|_E := \lim f_n|_E$ is $\mathcal{A}_E$-measurable, and therefore the a.e.-defined
function $f$ is "measurable" in the above sense. We may define $f$ as an arbitrary
constant $c \in \mathbb{C}$ on $E^c$; the function thus extended to $X$ is $\mathcal{A}$-measurable, as seen
by the argument preceding Definition 1.18.

Now $f_n I_E$ are $\mathcal{A}$-measurable, converge pointwise everywhere to $f I_E$, and if
$|f_n| \le g \in L^1$ for all $n$ (wherever the functions are defined), then $|f_n I_E| \le g \in L^1$
(everywhere!). By Theorem 1.20,

$$\|f_n - f\|_1 = \|f_n I_E - f I_E\|_1 \to 0.$$


We then have the following a.e. version of the Lebesgue Dominated Convergence 
theorem: 


Theorem 1.21. Let $\{f_n\}$ be a sequence of a.e.-defined measurable complex
functions on $X$, converging a.e. to the function $f$. Let $g \in L^1$ be such that
$|f_n| \le g$ for all $n$ (at all points where $f_n$ is defined). Then $f$ and $f_n$ are in $L^1$,
and $f_n \to f$ in the $L^1$-metric (in particular, $\int f_n \, d\mu \to \int f \, d\mu$).

A useful "almost everywhere" proposition is the following:

Proposition 1.22. If $f \in L^1(\mu)$ satisfies $\int_E f \, d\mu = 0$ for every $E \in \mathcal{A}$, then
$f = 0$ a.e.

Proof. Let $E = [u := \Re f > 0]$. Then $E \in \mathcal{A}$, so

$$\|u^+\|_1 = \int_E u \, d\mu = \Re \int_E f \, d\mu = 0,$$

and therefore $u^+ = 0$ a.e. Similarly $u^- = v^+ = v^- = 0$ a.e. (where $v := \Im f$), so
that $f = 0$ a.e.


We should remark that, in general, a measurable a.e.-defined function f can 
be extended as a measurable function on X only by defining it as constant on 
the exceptional null set $E^c$. Indeed, the null set $E^c$ could have a non-measurable
subset $A$. Suppose $f : E \to \mathbb{C}$ is not onto, and let $a \in f(E)^c$. If we assign on
$A$ the (constant complex) value $a$, and any value $b \in f(E)$ on $E^c - A$, then the
extended function is not measurable, because $[f = a] = A \notin \mathcal{A}$.




In order to be able to extend f in an arbitrary fashion and always get 
a measurable function, it is sufficient that subsets of null sets should be 
measurable (recall that a “null set” is measurable by definition!). A measure 
space with this property is called a complete measure space. Indeed, let $f'$ be
an arbitrary extension to $X$ of an a.e.-defined measurable function $f$, defined on
$E \in \mathcal{A}$, with $E^c$ null. Then for any open $V \subset \mathbb{C}$,

$$[f' \in V] = ([f' \in V] \cap E) \cup ([f' \in V] \cap E^c).$$

The first set in the union is in $\mathcal{A}$, by measurability of the a.e.-defined function
$f$; the second set is in $\mathcal{A}$ as a subset of the null set $E^c$ (by completeness of the
measure space). Hence $[f' \in V] \in \mathcal{A}$, and $f'$ is measurable.

We say that the measure space $(X, \mathcal{M}, \nu)$ is an extension of the measure
space $(X, \mathcal{A}, \mu)$ (both on $X$!) if $\mathcal{A} \subset \mathcal{M}$ and $\nu = \mu$ on $\mathcal{A}$. It is important to know
that any measure space $(X, \mathcal{A}, \mu)$ has a (unique) "minimal" complete extension
$(X, \mathcal{M}, \nu)$, where minimality means that if $(X, \mathcal{N}, \sigma)$ is any complete extension
of $(X, \mathcal{A}, \mu)$, then it is an extension of $(X, \mathcal{M}, \nu)$. Uniqueness is of course trivial.
The existence is proved below by a "canonical" construction.


Theorem 1.23. Any measure space $(X, \mathcal{A}, \mu)$ has a unique minimal complete
extension $(X, \mathcal{M}, \nu)$ (called the completion of the given measure space).

Proof. We let $\mathcal{M}$ be the collection of all subsets $E$ of $X$ for which there exist
$A, B \in \mathcal{A}$ such that

$$A \subset E \subset B, \qquad \mu(B - A) = 0. \tag{6}$$

If $E \in \mathcal{A}$, we may take $A = B = E$ in (6), so $\mathcal{A} \subset \mathcal{M}$. In particular $X \in \mathcal{M}$.
If $E \in \mathcal{M}$ and $A, B$ are as in (6), then $A^c, B^c \in \mathcal{A}$,

$$B^c \subset E^c \subset A^c$$

and $\mu(A^c - B^c) = \mu(B - A) = 0$, so that $E^c \in \mathcal{M}$.
If $E_j \in \mathcal{M}$, $j = 1, 2, \dots$ and $A_j, B_j$ are as in (6) (for $E_j$), then if $E, A, B$ are
the respective unions of $E_j, A_j, B_j$, we have $A, B \in \mathcal{A}$, $A \subset E \subset B$, and

$$B - A = \bigcup_j (B_j - A) \subset \bigcup_j (B_j - A_j).$$

The union on the right is a null set (as a countable union of null sets, by
$\sigma$-subadditivity of measures), and therefore $B - A$ is a null set (by monotonicity
of measures). This shows that $E \in \mathcal{M}$, and we conclude that $\mathcal{M}$ is a $\sigma$-algebra.

For $E \in \mathcal{M}$ and $A, B$ as in (6), we let $\nu(E) = \mu(A)$. The function $\nu$ is well
defined on $\mathcal{M}$, that is, the definition does not depend on the choice of $A, B$ as
in (6). Indeed, if $A', B'$ satisfy (6) with $E$, then

$$A - A' \subset E - A' \subset B' - A',$$

so that $A - A'$ is a null set. Hence by additivity of $\mu$, $\mu(A) = \mu(A \cap A') +
\mu(A - A') = \mu(A \cap A')$. Interchanging the roles of $A$ and $A'$, we also have
$\mu(A') = \mu(A \cap A')$, and therefore $\mu(A) = \mu(A')$, as wanted.




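If $E \in \mathcal{A}$, we could choose $A = B = E$, and so $\nu(E) = \mu(E)$. In particular,
$\nu(\emptyset) = 0$. If $\{E_j\}$ is a sequence of mutually disjoint sets in $\mathcal{M}$ with union $E$,
and $A_j, B_j$ are as in (6) (for $E_j$), we observed above that we could choose $A$ for
$E$ (for (6)) as the union of the sets $A_j$. Since $A_j \subset E_j$, $j = 1, 2, \dots$ and $E_j$ are
mutually disjoint, so are the sets $A_j$. Hence

$$\nu(E) := \mu(A) = \sum_j \mu(A_j) = \sum_j \nu(E_j),$$

and we conclude that $(X, \mathcal{M}, \nu)$ is a measure space extending $(X, \mathcal{A}, \mu)$. It is
complete, because if $E \in \mathcal{M}$ is $\nu$-null and $A, B$ are as in (6), then for any
$F \subset E$, we have

$$\emptyset \subset F \subset B,$$

and since $\mu(B - A) = 0$,

$$\mu(B - \emptyset) = \mu(A) + \mu(B - A) = \nu(E) = 0,$$

so that $F \in \mathcal{M}$.

Finally, suppose $(X, \mathcal{N}, \sigma)$ is any complete extension of $(X, \mathcal{A}, \mu)$, let $E \in \mathcal{M}$,
and let $A, B$ be as in (6). Write $E = A \cup (E - A)$. The set $B - A \in \mathcal{A} \subset \mathcal{N}$ is $\sigma$-null
($\sigma(B - A) = \mu(B - A) = 0$). By completeness of $(X, \mathcal{N}, \sigma)$, the subset $E - A$ of
$B - A$ belongs to $\mathcal{N}$ (and is of course $\sigma$-null). Since $A \in \mathcal{A} \subset \mathcal{N}$, we conclude
that $E \in \mathcal{N}$ and $\mathcal{M} \subset \mathcal{N}$. Also since $\sigma = \mu$ on $\mathcal{A}$, $\sigma(E) = \sigma(A) + \sigma(E - A) =
\mu(A) := \nu(E)$, so that $\sigma = \nu$ on $\mathcal{M}$.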
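The construction (6) can be carried out by brute force on a small finite example. The sketch below assumes a hypothetical measure space on $X = \{0,1,2,3\}$ whose $\sigma$-algebra is generated by the atoms $\{0\}$, $\{1\}$, $\{2,3\}$, with $\{2,3\}$ a null set; all names (`atoms`, `sigma_algebra`, `completion`) are illustrative only.

```python
from itertools import chain, combinations

def powerset(items):
    """All subsets of `items`, each returned as a frozenset."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]

# Hypothetical measure space: atoms of the sigma-algebra with their masses.
X = frozenset({0, 1, 2, 3})
atoms = {frozenset({0}): 1, frozenset({1}): 2, frozenset({2, 3}): 0}

# Every measurable set is a union of atoms; mu is additive over atoms.
sigma_algebra = {}
for chosen in powerset(atoms):
    E = frozenset().union(*chosen)
    sigma_algebra[E] = sum(atoms[a] for a in chosen)

# Completion as in (6): E belongs to M iff A ⊂ E ⊂ B for some measurable
# A, B with mu(B - A) = 0, and then nu(E) := mu(A).
completion = {}
for E in powerset(X):
    for A, mu_A in sigma_algebra.items():
        if A <= E and any(E <= B and sigma_algebra[B - A] == 0
                          for B in sigma_algebra):
            completion[E] = mu_A
            break
```

In this example the completion contains all 16 subsets of $X$: the sets $\{2\}$ and $\{3\}$, which are not measurable originally, become measurable null sets.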


1.5 $L^p$-spaces


Let $(X, \mathcal{A}, \mu)$ be a (positive) measure space, and let $p \in [1, \infty)$. If $f : X \to [0, \infty]$
is measurable, so is $f^p$ by Lemma 1.6, and therefore $\int f^p \, d\mu \in [0, \infty]$ is well
defined. We denote

$$\|f\|_p := \Big( \int f^p \, d\mu \Big)^{1/p}.$$

Theorem 1.24 (Hölder's inequality). Let $p, q \in (1, \infty)$ be conjugate
exponents, that is,

$$\frac{1}{p} + \frac{1}{q} = 1. \tag{1}$$

Then for all measurable functions $f, g : X \to [0, \infty]$,

$$\int f g \, d\mu \le \|f\|_p \, \|g\|_q. \tag{2}$$


Proof. If $\|f\|_p = 0$, then $\|f^p\|_1 = 0$, and therefore $f = 0$ a.e.; hence $fg = 0$
a.e., and the left-hand side of (2) vanishes (as well as the right-hand side). By
symmetry, the same holds true if $\|g\|_q = 0$. So we may consider only the case
where $\|f\|_p$ and $\|g\|_q$ are both positive. Now if one of these quantities is infinite,
the right-hand side of (2) is infinite, and (2) is trivially true. So we may assume
that both quantities belong to $(0, \infty)$ (positive and finite). Denote

$$u = f / \|f\|_p, \qquad v = g / \|g\|_q. \tag{3}$$

Then

$$\|u\|_p = \|v\|_q = 1. \tag{4}$$

It suffices to prove that

$$\int u v \, d\mu \le 1, \tag{5}$$

because (2) would follow by substituting (3) in (5).
The logarithmic function is concave ($(\log t)'' = -(1/t^2) < 0$). Therefore,
by (1)

$$\frac{1}{p} \log s + \frac{1}{q} \log t \le \log \Big( \frac{s}{p} + \frac{t}{q} \Big)$$

for all $s, t \in (0, \infty)$. Equivalently,

$$s^{1/p} t^{1/q} \le \frac{s}{p} + \frac{t}{q}, \qquad s, t \in (0, \infty). \tag{6}$$

When $x \in X$ is such that $u(x), v(x) \in (0, \infty)$, we substitute $s = u(x)^p$ and
$t = v(x)^q$ in (6) and obtain

$$u(x) v(x) \le \frac{u(x)^p}{p} + \frac{v(x)^q}{q}, \tag{7}$$

and this inequality is trivially true when $u(x)$ or $v(x)$ belongs to $\{0, \infty\}$. Thus (7) is valid
on $X$, and integrating the inequality over $X$, we obtain by (4) and (1)

$$\int u v \, d\mu \le \frac{\|u\|_p^p}{p} + \frac{\|v\|_q^q}{q} = \frac{1}{p} + \frac{1}{q} = 1.$$
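Inequality (2) can be checked numerically on a finite set with point masses, where the integrals are finite sums. The following is a small sanity check under that assumption, not a proof; the helper `p_norm` and the sample data are hypothetical.

```python
def p_norm(values, masses, p):
    """(integral of |f|^p d mu)^(1/p) on a finite set with point masses."""
    return sum(abs(v) ** p * w for v, w in zip(values, masses)) ** (1.0 / p)

# Hypothetical data: four points with masses mu and two non-negative functions.
mu = [0.5, 1.0, 2.0, 0.25]
f = [1.0, 3.0, 0.5, 4.0]
g = [2.0, 0.0, 1.5, 1.0]

p = 3.0
q = p / (p - 1.0)                    # conjugate exponent: 1/p + 1/q = 1

lhs = sum(a * b * w for a, b, w in zip(f, g, mu))   # integral of f*g
rhs = p_norm(f, mu, p) * p_norm(g, mu, q)           # product of the norms
print(lhs, "<=", rhs)
```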


Theorem 1.25 (Minkowski's inequality). For any measurable functions
$f, g : X \to [0, \infty]$,

$$\|f + g\|_p \le \|f\|_p + \|g\|_p \qquad (1 \le p < \infty). \tag{8}$$

Proof. Since (8) is trivial for $p = 1$ (by the additivity of the integral of
non-negative measurable functions, we get even an equality), we consider $p \in (1, \infty)$.
The case $\|f + g\|_p = 0$ is trivial. By convexity of the function $t^p$ (for $p > 1$),
$((s + t)/2)^p \le (s^p + t^p)/2$ for $s, t \in (0, \infty)$. Therefore, if $x \in X$ is such that
$f(x), g(x) \in (0, \infty)$,

$$(f(x) + g(x))^p \le 2^{p-1} [f(x)^p + g(x)^p], \tag{9}$$

and (9) is trivially true if $f(x)$ or $g(x)$ belongs to $\{0, \infty\}$, and holds therefore on $X$.
Integrating, we obtain

$$\|f + g\|_p^p \le 2^{p-1} [\|f\|_p^p + \|g\|_p^p]. \tag{10}$$

If $\|f + g\|_p = \infty$, it follows from (10) that at least one of the quantities $\|f\|_p$, $\|g\|_p$
is infinite, and (8) is then valid (as the trivial equality $\infty = \infty$). This discussion
shows that we may restrict our attention to the case

$$0 < \|f + g\|_p < \infty. \tag{11}$$

We write

$$(f + g)^p = f (f + g)^{p-1} + g (f + g)^{p-1}. \tag{12}$$

By Hölder's inequality,

$$\int f (f + g)^{p-1} \, d\mu \le \|f\|_p \, \|(f + g)^{p-1}\|_q = \|f\|_p \, \|f + g\|_p^{p/q},$$

since $(p - 1)q = p$ for conjugate exponents $p, q$. A similar estimate holds for the
integral of the second summand on the right-hand side of (12). Adding these
estimates, we obtain

$$\|f + g\|_p^p \le (\|f\|_p + \|g\|_p) \, \|f + g\|_p^{p/q}.$$

By (11), we may divide this inequality by $\|f + g\|_p^{p/q}$, and (8) follows since
$p - p/q = 1$.


In a manner analogous to that used for $L^1$, if $p \in [1, \infty)$, we consider the set

$$L^p(X, \mathcal{A}, \mu)$$

(or briefly, $L^p(\mu)$, or $L^p(X)$, or $L^p$, when the unmentioned parameters are
understood) of all (equivalence classes) of measurable complex functions $f$ on
$X$, with

$$\|f\|_p := \| \, |f| \, \|_p < \infty.$$

Since $\|\cdot\|_p$ is trivially homogeneous, it follows from (8) that $L^p$ is a normed
space (over $\mathbb{C}$) for the pointwise operations and the norm $\|\cdot\|_p$. We can restate
Hölder's inequality in the form:

Theorem 1.26. Let $p, q \in (1, \infty)$ be conjugate exponents. If $f \in L^p$ and $g \in L^q$,
then $fg \in L^1$, and

$$\|fg\|_1 \le \|f\|_p \, \|g\|_q.$$


A sufficient condition for convergence in the $L^p$-metric follows at once from
Theorem 1.21:

Proposition. Let $\{f_n\}$ be a sequence of a.e.-defined measurable complex
functions on $X$, converging a.e. to the function $f$. For some $p \in [1, \infty)$, suppose
there exists $g \in L^p$ such that $|f_n| \le g$ for all $n$ (with the usual equivalence class
ambiguity). Then $f, f_n \in L^p$, and $f_n \to f$ in the $L^p$-metric.

Proof. The first statement follows from the inequalities $|f|^p, |f_n|^p \le g^p \in L^1$.
Since $|f - f_n|^p \to 0$ a.e. and $|f - f_n|^p \le (2g)^p \in L^1$, the second statement follows
from Theorem 1.21.


The positive measure space $(X, \mathcal{A}, \mu)$ is said to be finite if $\mu(X) < \infty$. When
this is the case, the Hölder inequality implies that $L^p(\mu) \subset L^r(\mu)$ topologically
(i.e., the inclusion map is continuous) when $1 \le r < p < \infty$. Indeed, if $f \in L^p(\mu)$,
then by Hölder's inequality with the conjugate exponents $p/r$ and $s := p/(p - r)$,

$$\|f\|_r^r = \int |f|^r \cdot 1 \, d\mu
  \le \Big( \int |f|^{r(p/r)} \, d\mu \Big)^{r/p} \Big( \int 1^s \, d\mu \Big)^{1/s} = \mu(X)^{1/s} \, \|f\|_p^r.$$

Since $1/rs = (1/r) - (1/p)$, we obtain

$$\|f\|_r \le \mu(X)^{1/r - 1/p} \, \|f\|_p. \tag{13}$$

Hence, $f \in L^r(\mu)$, and (13) (with $f - g$ replacing $f$) shows the continuity of the
inclusion map of $L^p(\mu)$ into $L^r(\mu)$.
Taking in particular $r = 1$, we get that $L^p(\mu) \subset L^1(\mu)$ (topologically) for all
$p > 1$, and

$$\|f\|_1 \le \mu(X)^{1/q} \, \|f\|_p, \tag{14}$$

where $q$ is the conjugate exponent of $p$.
We formalize this discussion for future reference.

Proposition. Let $(X, \mathcal{A}, \mu)$ be a finite positive measure space. Then $L^p(\mu) \subset
L^r(\mu)$ (topologically) for $1 \le r < p < \infty$, and the norms inequality (13) holds.
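The norms inequality (13) can be tested numerically on a finite measure space given by point masses. The sketch below is a sanity check under that assumption; `lp_norm` and the sample data are hypothetical. A constant function is included because it turns (13) into an equality, so the constant $\mu(X)^{1/r - 1/p}$ cannot be improved.

```python
def lp_norm(values, masses, p):
    """(integral of |f|^p d mu)^(1/p) on a finite set with point masses."""
    return sum(abs(v) ** p * w for v, w in zip(values, masses)) ** (1.0 / p)

# Hypothetical finite measure space: three points, total mass mu(X) = 4.
mu = [0.5, 1.5, 2.0]
total_mass = sum(mu)
r, p = 1.5, 4.0
factor = total_mass ** (1.0 / r - 1.0 / p)       # mu(X)^(1/r - 1/p)

f = [3.0, -1.0, 0.5]
holds = lp_norm(f, mu, r) <= factor * lp_norm(f, mu, p)

# For a constant function both sides of (13) coincide.
const = [2.0, 2.0, 2.0]
gap = factor * lp_norm(const, mu, p) - lp_norm(const, mu, r)
```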


Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be measurable spaces, and let $h : X \to Y$ be a
measurable map (cf. Definition 1.2). If $\mu$ is a measure on $\mathcal{A}$, the function
$\nu : \mathcal{B} \to [0, \infty]$ given by

$$\nu(E) = \mu(h^{-1}(E)), \qquad E \in \mathcal{B} \tag{15}$$

is well defined, and is clearly a measure on $\mathcal{B}$. Since $I_{h^{-1}(E)} = I_E \circ h$, we can
write (15) in the form

$$\int_Y I_E \, d\nu = \int_X I_E \circ h \, d\mu.$$

By linearity of the integral, it follows that

$$\int_Y f \, d\nu = \int_X f \circ h \, d\mu \tag{16}$$

for every $\mathcal{B}$-measurable simple function $f$ on $Y$. If $f : Y \to [0, \infty]$ is
$\mathcal{B}$-measurable, use the Approximation Theorem 1.8 to obtain a non-decreasing
sequence $\{f_n\}$ of $\mathcal{B}$-measurable non-negative simple functions converging
pointwise to $f$; then $\{f_n \circ h\}$ is a similar sequence converging to $f \circ h$, and
the Monotone Convergence Theorem shows that (16) is true for all such $f$.

If $f : Y \to \mathbb{C}$ is $\mathcal{B}$-measurable, then $f \circ h$ is a (complex) $\mathcal{A}$-measurable function
on $X$, and for any $1 \le p < \infty$,

$$\int_Y |f|^p \, d\nu = \int_X |f|^p \circ h \, d\mu = \int_X |f \circ h|^p \, d\mu.$$

Thus, $f \in L^p(\nu)$ for some $p \in [1, \infty)$ if and only if $f \circ h \in L^p(\mu)$, and

$$\|f\|_{L^p(\nu)} = \|f \circ h\|_{L^p(\mu)}.$$

In particular (case $p = 1$), $f$ is $\nu$-integrable on $Y$ if and only if $f \circ h$ is $\mu$-integrable
on $X$. When this is the case, writing $f$ as a linear combination of four
non-negative $\nu$-integrable functions, we see that (16) is valid for all such $f$.


Proposition. Let $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ be measurable spaces, and let $h : X \to Y$ be
a measurable map. For any (positive) measure $\mu$ on $\mathcal{A}$, define $\nu(E) := \mu(h^{-1}(E))$
for $E \in \mathcal{B}$. Then:

(1) $\nu$ is a (positive) measure on $\mathcal{B}$;

(2) if $f : Y \to [0, \infty]$ is $\mathcal{B}$-measurable, then $f \circ h$ is $\mathcal{A}$-measurable and (16) is
valid;

(3) if $f : Y \to \mathbb{C}$ is $\mathcal{B}$-measurable, then $f \circ h$ is $\mathcal{A}$-measurable; $f \in L^p(\nu)$ for
some $p \in [1, \infty)$ if and only if $f \circ h \in L^p(\mu)$, and in that case, the map
$f \mapsto f \circ h$ is norm-preserving; in the special case $p = 1$, the map is integral
preserving (i.e. (16) is valid).
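The change-of-variables formula (16) can be verified directly when $X$ is finite and $\mu$ is given by point masses: the measure $\nu$ is then obtained by adding up the masses of all points with the same image under $h$. The map `h`, the measure `mu` and the integrand `f` below are hypothetical choices made for illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical measure on X = {-2, -1, 0, 1, 2}, given by point masses.
mu = {-2: Fraction(1, 4), -1: Fraction(1, 4), 0: Fraction(1, 8),
      1: Fraction(1, 8), 2: Fraction(1, 4)}
h = lambda x: x * x                  # map from X onto Y = {0, 1, 4}

# nu(E) = mu(h^{-1}(E)); on a finite space it suffices to record nu({y}).
nu = defaultdict(Fraction)
for x, w in mu.items():
    nu[h(x)] += w

f = lambda y: y + 1
lhs = sum(f(y) * w for y, w in nu.items())       # integral of f over Y w.r.t. nu
rhs = sum(f(h(x)) * w for x, w in mu.items())    # integral of f∘h over X w.r.t. mu
```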

If $\phi$ is a simple complex measurable function with distinct non-zero values $c_j$
assumed on $E_j$, then

$$\|\phi\|_p^p = \sum_j |c_j|^p \mu(E_j)$$

is finite if and only if $\mu(E_j) < \infty$ for all $j$, that is, equivalently, iff

$$\mu([|\phi| > 0]) < \infty.$$

Thus, the simple functions in $L^p$ (for any $p \in [1, \infty)$) are the (measurable) simple
functions vanishing outside a measurable set of finite measure (depending on the
function). These functions are dense in $L^p$. Indeed, if $0 \le f \in L^p$ (without
loss of generality, we assume that $f$ is everywhere defined!), the Approximation
Theorem provides a sequence of simple measurable functions

$$0 \le \phi_1 \le \phi_2 \le \cdots \le f$$

such that $\phi_n \to f$ pointwise. By the proposition following Theorem 1.26, $\phi_n \to f$
in the $L^p$-metric.

For $f \in L^p$ complex, we may write $f = \sum_{k=0}^{3} i^k g_k$ with $0 \le g_k \in L^p$
($g_0 := u^+$, etc., where $u = \Re f$). We then obtain four sequences $\{\phi_{n,k}\}$ of simple
functions in $L^p$ converging, respectively, to $g_k$, $k = 0, \dots, 3$, in the $L^p$-metric;
if $\phi_n := \sum_{k=0}^{3} i^k \phi_{n,k}$, then $\phi_n$ are simple $L^p$-functions, and $\phi_n \to f$ in the
$L^p$-metric. We proved

Theorem 1.27. For any $p \in [1, \infty)$, the simple functions in $L^p$ are dense in $L^p$.


Actually, $L^p$ is the completion of the normed space of all measurable simple
functions vanishing outside a set of finite measure, with respect to the $L^p$-metric
(induced by the $L^p$-norm). The meaning of this statement is made clear by the
following definition.

Definition 1.28. Let $Z$ be a metric space, with metric $d$. A Cauchy sequence
in $Z$ is a sequence $\{z_n\} \subset Z$ such that $d(z_n, z_m) \to 0$ when $n, m \to \infty$. The
space $Z$ is complete if every Cauchy sequence in $Z$ converges in $Z$. If $Y \subset Z$ is
dense in $Z$, and $Z$ is complete, we also say that $Z$ is the completion of $Y$ (for the
metric $d$). The completion of $Y$ (for the metric $d$) is unique in a suitable sense.

A complete normed space is called a Banach space.
In order to get the conclusion preceding Definition 1.28, we still have to prove
that $L^p$ is complete:

Theorem 1.29. $L^p$ is a Banach space for each $p \in [1, \infty)$.

We first prove the following.


Lemma 1.30. Let $\{f_n\}$ be a Cauchy sequence in $L^p(\mu)$. Then it has a
subsequence converging pointwise $\mu$-a.e.

Proof of Lemma. Since $\{f_n\}$ is Cauchy, there exists $m_k \in \mathbb{N}$ such that
$\|f_n - f_m\|_p < 1/2^k$ for all $n > m > m_k$. Set

$$n_k = k + \max(m_1, \dots, m_k).$$

Then $n_{k+1} > n_k > m_k$, and therefore $\{f_{n_k}\}$ is a subsequence of $\{f_n\}$ satisfying

$$\|f_{n_{k+1}} - f_{n_k}\|_p < 1/2^k, \qquad k = 1, 2, \dots \tag{17}$$

Consider the series

$$g = \sum_{k=1}^{\infty} |f_{n_{k+1}} - f_{n_k}|, \tag{18}$$

and its partial sums $g_m$. By Theorem 1.25 and (17),

$$\|g_m\|_p \le \sum_{k=1}^{m} \|f_{n_{k+1}} - f_{n_k}\|_p < \sum_{k=1}^{\infty} 1/2^k = 1$$

for all $m$. By Fatou's lemma,

$$\int g^p \, d\mu \le \liminf_m \int g_m^p \, d\mu = \liminf_m \|g_m\|_p^p \le 1.$$

Therefore, $g < \infty$ a.e., that is, the series (18) converges a.e., that is, the series

$$f_{n_1} + \sum_{k=1}^{\infty} (f_{n_{k+1}} - f_{n_k}) \tag{19}$$

converges absolutely pointwise a.e. to its sum $f$ (extended as 0 on the null set
where the series does not converge). Since the partial sums of (19) are precisely
$f_{n_m}$, the lemma is proved.
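Passing to a subsequence in Lemma 1.30 is genuinely necessary. A standard illustration, stated here under the assumption that $m$ denotes Lebesgue measure on $[0,1]$ (a measure not constructed in the text up to this point), is the so-called "typewriter" sequence:

```latex
\[
  f_n = I_{[\,j/2^k,\ (j+1)/2^k\,]}
  \qquad \text{for } n = 2^k + j, \quad k = 0, 1, 2, \dots, \quad 0 \le j < 2^k .
\]
\[
  \|f_n\|_p = \Big( \int_0^1 f_n^p \, dm \Big)^{1/p} = 2^{-k/p} \to 0 ,
\]
\[
  \text{but for every } x \in [0,1]: \quad
  f_n(x) = 1 \text{ for infinitely many } n
  \quad \text{and} \quad
  f_n(x) = 0 \text{ for infinitely many } n .
\]
\[
  \text{The subsequence } f_{2^k} = I_{[0,\, 2^{-k}]} \text{ converges to } 0
  \text{ at every } x \neq 0 .
\]
```

So $\{f_n\}$ converges to 0 in the $L^p$-metric (hence is Cauchy) and converges pointwise nowhere, while a suitable subsequence converges a.e.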


Proof of Theorem 1.29. Let $\{f_n\} \subset L^p$ be Cauchy. Thus for any $\epsilon > 0$, there
exists $n_\epsilon \in \mathbb{N}$ such that

$$\|f_n - f_m\|_p < \epsilon \tag{20}$$

for all $n, m > n_\epsilon$. By the lemma, let then $\{f_{n_k}\}$ be a subsequence converging
pointwise a.e. to the (measurable) complex function $f$. Applying Fatou's lemma
to the non-negative measurable functions $|f_{n_k} - f_m|^p$, we obtain

$$\|f - f_m\|_p^p = \int \lim_k |f_{n_k} - f_m|^p \, d\mu \le \liminf_k \|f_{n_k} - f_m\|_p^p \le \epsilon^p \tag{21}$$

for all $m > n_\epsilon$. In particular, $f - f_m \in L^p$, and therefore $f = (f - f_m) + f_m \in L^p$,
and (21) means that $f_m \to f$ in the $L^p$-metric.


Definition 1.31. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let $f: X \to \mathbb{C}$
be a measurable function. We say that $M \in [0, \infty]$ is an a.e. upper bound for
$|f|$ if $|f| \le M$ a.e. The infimum of all the a.e. upper bounds for $|f|$ is called
the essential supremum of $|f|$, and is denoted $\|f\|_\infty$. The set of all (equivalence
classes of) measurable complex functions $f$ on $X$ with $\|f\|_\infty < \infty$ will be
denoted by $L^\infty(\mu)$ (or $L^\infty(X)$, or $L^\infty(X, \mathcal{A}, \mu)$, or $L^\infty$, depending on which
“parameter” we wish to stress, if at all).
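
As an illustration of Definition 1.31 (not part of the original text), consider a finite set carrying point masses. The points, weights, and function values below are invented for the example. On such a space a set is null exactly when all its points have weight zero, so the essential supremum is the largest $|f(x)|$ over points of positive weight, and it can be strictly smaller than the ordinary supremum.

```python
# Toy measure space: a finite set with point masses mu({x}) = weight[x].
# All names and numbers are illustrative assumptions, not from the text.
weight = {"a": 0.5, "b": 0.0, "c": 2.0}
f = {"a": 1 + 1j, "b": 100.0, "c": -1.0}


def ess_sup(f, weight):
    """Smallest M with |f| <= M outside a mu-null set (0 if mu = 0)."""
    return max((abs(f[x]) for x in f if weight[x] > 0), default=0.0)


def plain_sup(f):
    """Ordinary supremum of |f| over all points, null or not."""
    return max(abs(v) for v in f.values())


# The value 100 at the null point "b" is invisible to the essential supremum.
print(ess_sup(f, weight), plain_sup(f))
```

Here the large value sits on a null point, so it affects the ordinary supremum only.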


By definition of the essential supremum, we have

$$|f| \le \|f\|_\infty \quad \text{a.e.} \tag{22}$$

In particular, $\|f\|_\infty = 0$ implies that $f = 0$ a.e. (that is, $f$ is the zero class).

If $f, g \in L^\infty$, then by (22), $|f + g| \le |f| + |g| \le \|f\|_\infty + \|g\|_\infty$ a.e., and so
$\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty$.

The homogeneity $\|\alpha f\|_\infty = |\alpha| \, \|f\|_\infty$ is trivial if either $\alpha = 0$ or $\|f\|_\infty = 0$.
Assume then $|\alpha|, \|f\|_\infty > 0$. For any $t \in (0, 1)$, $t\|f\|_\infty < \|f\|_\infty$, hence it
is not an a.e. upper bound for $|f|$, so that $\mu([|f| > t\|f\|_\infty]) > 0$, that is,
$\mu([|\alpha f| > t|\alpha| \, \|f\|_\infty]) > 0$. Therefore, $\|\alpha f\|_\infty \ge t|\alpha| \, \|f\|_\infty$ for all $t \in (0, 1)$,
hence $\|\alpha f\|_\infty \ge |\alpha| \, \|f\|_\infty$. The reversed inequality follows trivially from (22),
and the homogeneity of $\|\cdot\|_\infty$ follows. We conclude that $L^\infty$ is a normed space
(over $\mathbb{C}$) for the pointwise operations and the $L^\infty$-norm $\|\cdot\|_\infty$.

We verify its completeness as follows. Let $\{f_n\}$ be a Cauchy sequence in $L^\infty$.
In particular, it is a bounded set in $L^\infty$. Let then $K = \sup_n \|f_n\|_\infty$. By (22), the
sets $F_k := [|f_k| > K]$ ($k \in \mathbb{N}$) and

$$E_{n,m} := [|f_n - f_m| > \|f_n - f_m\|_\infty] \qquad (n, m \in \mathbb{N})$$

are $\mu$-null, so their (countable) union $E$ is null. For all $x \in E^c$,

$$|f_n(x) - f_m(x)| \le \|f_n - f_m\|_\infty \to 0$$

as $n, m \to \infty$ and $|f_n(x)| \le K$. By completeness of $\mathbb{C}$, the limit $f(x) :=
\lim_n f_n(x)$ exists for all $x \in E^c$ and $|f(x)| \le K$. Defining $f(x) = 0$ for all $x \in E$,
we obtain a measurable function on $X$ such that $|f| \le K$, that is, $f \in L^\infty$. Given
$\epsilon > 0$, let $n_\epsilon \in \mathbb{N}$ be such that

$$\|f_n - f_m\|_\infty < \epsilon \qquad (n, m > n_\epsilon).$$

Since $|f_n(x) - f_m(x)| < \epsilon$ for all $x \in E^c$ and $n, m > n_\epsilon$, letting $m \to \infty$, we
obtain $|f_n(x) - f(x)| \le \epsilon$ for all $x \in E^c$ and $n > n_\epsilon$, and since $\mu(E) = 0$,

$$\|f_n - f\|_\infty \le \epsilon \qquad (n > n_\epsilon),$$

that is, $f_n \to f$ in the $L^\infty$-metric. We proved

Theorem 1.32. $L^\infty$ is a Banach space.


Defining the conjugate exponent of $p = 1$ to be $q = \infty$ (so that
$(1/p) + (1/q) = 1$ is formally valid in the usual sense), Hölder's inequality
remains true for this pair of conjugate exponents. Indeed, if $f \in L^1$ and $g \in L^\infty$,
then $|fg| \le \|g\|_\infty |f|$ a.e., and therefore $fg \in L^1$ and

$$\|fg\|_1 \le \|g\|_\infty \|f\|_1.$$

Formally:


Theorem 1.33. Hölder's inequality (Theorem 1.26) is valid for conjugate
exponents $p, q \in [1, \infty]$.


1.6 Inner product 


For the conjugate pair $(p, q) = (2, 2)$, Theorem 1.26 asserts that if $f, g \in L^2$,
then the product $f\bar{g}$ is integrable, so we may define

$$(f, g) := \int f \bar{g} \, d\mu \tag{1}$$

($\bar{g}$ denotes here the complex conjugate of $g$). The function (or form) $(\cdot,\cdot)$ has
obviously the following properties on $L^2 \times L^2$:

(i) $(f, f) \ge 0$, and $(f, f) = 0$ if and only if $f = 0$ (the zero element);
(ii) $(\cdot, g)$ is linear for each given $g \in L^2$;
(iii) $(g, f) = \overline{(f, g)}$.




Property (i) is called positive definiteness of the form $(\cdot,\cdot)$; Properties (ii)
and (iii) (together) are referred to as sesquilinearity or Hermitianity of the form.
We may also consider the weaker condition

(i') $(f, f) \ge 0$ for all $f$,

called (positive) semi-definiteness of the form.


Definition 1.34. Let $X$ be a complex vector space (with elements $x, y, \ldots$).
A (semi)-inner product on $X$ is a (semi)-definite sesquilinear form $(\cdot,\cdot)$ on $X$.
The space $X$ with a given (semi)-inner product is called a (semi)-inner product
space.


If $X$ is a semi-inner product space, the non-negative square root of $(x, x)$ is
denoted $\|x\|$.

Thus $L^2$ is an inner product space for the inner product (1) and $\|f\| :=
(f, f)^{1/2} = \|f\|_2$. By Theorem 1.26 with $p = q = 2$,

$$|(f, g)| \le \|f\|_2 \, \|g\|_2 \tag{2}$$

for all $f, g \in L^2$. This special case of the Hölder inequality is called the
Cauchy–Schwarz inequality. We demonstrate below that it is valid in any semi-inner
product space.

Observe that any sesquilinear form $(\cdot,\cdot)$ is conjugate linear with respect to
its second variable, that is, for each given $x \in X$,

$$(x, \alpha u + \beta v) = \bar{\alpha}(x, u) + \bar{\beta}(x, v) \tag{3}$$

for all $\alpha, \beta \in \mathbb{C}$ and $u, v \in X$.
In particular

$$(x, 0) = (0, y) = 0 \tag{4}$$

for all $x, y \in X$.
By (ii) and (3), for all $\lambda \in \mathbb{C}$ and $x, y \in X$,

$$(x + \lambda y, x + \lambda y) = (x, x) + \bar{\lambda}(x, y) + \lambda(y, x) + |\lambda|^2 (y, y).$$

Since $\lambda(y, x)$ is the conjugate of $\bar{\lambda}(x, y)$ by (iii), we obtain the identity (for all
$\lambda \in \mathbb{C}$ and $x, y \in X$)

$$\|x + \lambda y\|^2 = \|x\|^2 + 2\Re[\bar{\lambda}(x, y)] + |\lambda|^2 \|y\|^2. \tag{5}$$

In particular, for $\lambda = 1$ and $\lambda = -1$, we have the identities

$$\|x + y\|^2 = \|x\|^2 + 2\Re(x, y) + \|y\|^2 \tag{6}$$

and

$$\|x - y\|^2 = \|x\|^2 - 2\Re(x, y) + \|y\|^2. \tag{7}$$

Adding, we obtain the so-called parallelogram identity for any s.i.p. (semi-inner
product):

$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2. \tag{8}$$




Subtracting (7) from (6), we obtain

$$4\Re(x, y) = \|x + y\|^2 - \|x - y\|^2. \tag{9}$$

If we replace $y$ by $iy$ in (9), we obtain

$$4\Im(x, y) = 4\Re[-i(x, y)] = 4\Re(x, iy) = \|x + iy\|^2 - \|x - iy\|^2. \tag{10}$$

By (9) and (10),

$$(x, y) = \frac{1}{4} \sum_{k=0}^{3} i^k \|x + i^k y\|^2, \tag{11}$$

where $i = \sqrt{-1}$. This is the so-called polarization identity (which expresses the
s.i.p. in terms of “induced norms”).
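
The identities (8) and (11) can be checked numerically. The sketch below is an added illustration. It assumes the standard inner product on $\mathbb{C}^3$, linear in the first variable and conjugate linear in the second. The helper names and the sample vectors are arbitrary.

```python
def inner(x, y):
    """Inner product on C^n, linear in x and conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def norm_sq(x):
    """Squared induced norm ||x||^2 = (x, x)."""
    return inner(x, x).real


def combine(x, y, c):
    """Return the vector x + c*y."""
    return [a + c * b for a, b in zip(x, y)]


x = [1 + 2j, -0.5j, 3 + 0j]
y = [2 - 1j, 4 + 0j, 1j]

# Parallelogram identity (8): both sides should agree up to rounding.
lhs_8 = norm_sq(combine(x, y, 1)) + norm_sq(combine(x, y, -1))
rhs_8 = 2 * norm_sq(x) + 2 * norm_sq(y)

# Polarization identity (11): recover (x, y) from four squared norms.
polar = sum(1j ** k * norm_sq(combine(x, y, 1j ** k)) for k in range(4)) / 4

print(abs(lhs_8 - rhs_8), abs(polar - inner(x, y)))
```

Both printed differences should be at rounding level.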
By (5),

$$0 \le \|x\|^2 + 2\Re[\bar{\lambda}(x, y)] + |\lambda|^2 \|y\|^2 \tag{12}$$

for all $\lambda \in \mathbb{C}$ and $x, y \in X$. If $\|y\| > 0$, take $\lambda = -(x, y)/\|y\|^2$; then
$|(x, y)|^2/\|y\|^2 \le \|x\|^2$, and therefore

$$|(x, y)| \le \|x\| \, \|y\|. \tag{13}$$

If $\|y\| = 0$ but $\|x\| > 0$, interchange the roles of $x$ and $y$ and use (iii) to reach the
same conclusion. If both $\|x\|$ and $\|y\|$ vanish, take $\lambda = -(x, y)$ in (12): we get
$0 \le -2|(x, y)|^2$, hence $|(x, y)| = 0 = \|x\| \, \|y\|$, and we conclude that (13) is valid
for all $x, y \in X$. This is the general Cauchy–Schwarz inequality for semi-inner
products.
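
To see that (13) does not need definiteness, here is a small added sketch with a genuinely degenerate form. The form on $\mathbb{C}^3$ used below, which ignores the third coordinate, is an assumption chosen for the example, as are the sample vectors.

```python
def semi_inner(x, y):
    """(x, y) = x0*conj(y0) + 2*x1*conj(y1); the third coordinate is ignored,
    so the form is positive semi-definite but not definite."""
    return x[0] * y[0].conjugate() + 2 * x[1] * y[1].conjugate()


def semi_norm(x):
    """Non-negative square root of (x, x)."""
    return semi_inner(x, x).real ** 0.5


samples = [
    [1 + 1j, 2 - 1j, 5 + 0j],
    [0j, 0j, 3 + 4j],       # nonzero vector whose semi-norm is 0
    [-2j, 0.5 + 0j, 1j],
]

# Check |(x, y)| <= ||x|| ||y|| on every ordered pair (small rounding slack).
cs_holds = all(
    abs(semi_inner(x, y)) <= semi_norm(x) * semi_norm(y) + 1e-12
    for x in samples
    for y in samples
)
null_vector_norm = semi_norm(samples[1])
print(cs_holds, null_vector_norm)
```

The second sample is a nonzero vector with $\|y\| = 0$, which is the case handled by the last two steps of the argument above.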

By (6) and (13),

$$\|x + y\|^2 \le \|x\|^2 + 2|(x, y)| + \|y\|^2 \le \|x\|^2 + 2\|x\| \, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2,$$

hence

$$\|x + y\| \le \|x\| + \|y\|$$

for all $x, y \in X$. Taking $x = 0$ in (5), we get $\|\lambda y\| = |\lambda| \, \|y\|$ for all $\lambda \in \mathbb{C}$ and
$y \in X$. We conclude that $\|\cdot\|$ is a semi-norm on $X$; it is a norm iff the s.i.p. is
an inner product, that is, iff it is definite. Thus, an inner product space $X$ is a
normed space for the norm $\|x\| := (x, x)^{1/2}$ induced by its inner product (unless
stated otherwise, this will be the standard norm for such spaces). In case $X$ is
complete, it is called a Hilbert space. Thus Hilbert spaces are special cases of
Banach spaces.

The norm induced by the inner product (1) on $L^2$ is the usual $L^2$-norm $\|\cdot\|_2$,
so that, by Theorem 1.29, $L^2$ is a Hilbert space.




1.7 Hilbert space: a first look 


We consider some “geometric” properties of Hilbert spaces. 


Theorem 1.35 (Distance theorem). Let $X$ be a Hilbert space, and let $K \subset X$
be non-empty, closed, and convex (i.e., $(x + y)/2 \in K$ whenever $x, y \in K$). Then
for each $x \in X$, there exists a unique $k \in K$ such that

$$d(x, k) = d(x, K). \tag{1}$$

The notation $d(x, y)$ is used for the metric induced by the norm, $d(x, y) :=
\|x - y\|$. As in any metric space, $d(x, K)$ denotes the distance from $x$ to $K$,
that is,

$$d(x, K) := \inf_{y \in K} d(x, y). \tag{2}$$


Proof. Let $d = d(x, K)$. Since $d^2 = \inf_{y \in K} \|x - y\|^2$, there exist $y_n \in K$ such
that

$$(d^2 \le) \ \|x - y_n\|^2 < d^2 + 1/n, \qquad n = 1, 2, \ldots \tag{3}$$

By the parallelogram identity,

$$\|y_n - y_m\|^2 = \|(x - y_m) - (x - y_n)\|^2 = 2\|x - y_m\|^2 + 2\|x - y_n\|^2 - \|(x - y_m) + (x - y_n)\|^2.$$

Rewrite the last term on the right-hand side in the form

$$4\|x - (y_m + y_n)/2\|^2 \ge 4d^2,$$

since $(y_m + y_n)/2 \in K$, by hypothesis. Hence by (3)

$$\|y_n - y_m\|^2 \le 2/m + 2/n \to 0$$

as $m, n \to \infty$. Thus, the sequence $\{y_n\}$ is Cauchy. Since $X$ is complete, the
sequence converges in $X$, and its limit $k$ is necessarily in $K$ because $y_n \in K$ for
all $n$ and $K$ is closed. By continuity of the norm on $X$, letting $n \to \infty$ in (3), we
obtain $\|x - k\| = d$, as wanted.

To prove uniqueness, suppose $k, k' \in K$ satisfy

$$\|x - k\| = \|x - k'\| = d.$$

Again by the parallelogram identity,

$$\|k - k'\|^2 = \|(x - k') - (x - k)\|^2 = 2\|x - k'\|^2 + 2\|x - k\|^2 - \|(x - k') + (x - k)\|^2.$$




As before, write the last term as $4\|x - (k + k')/2\|^2 \ge 4d^2$ (since $(k + k')/2 \in K$
by hypothesis). Hence

$$\|k - k'\|^2 \le 2d^2 + 2d^2 - 4d^2 = 0,$$

and therefore $k = k'$.
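
A concrete instance of the distance theorem, added here as a sketch: in the Hilbert space $\mathbb{R}^2$, the closed unit square is non-empty, closed, and convex, and its nearest point to $x$ is obtained by clamping each coordinate. The choice of set, the sample point, and the grid used for comparison are all assumptions made for the illustration.

```python
import math


def project_onto_box(x, lo=0.0, hi=1.0):
    """Nearest point to x in the closed box [lo, hi]^n (coordinate clamping)."""
    return [min(max(t, lo), hi) for t in x]


x = [1.7, -0.4]
k = project_onto_box(x)

# Compare with a brute-force search over a grid of points of K = [0, 1]^2.
grid = [[i / 20, j / 20] for i in range(21) for j in range(21)]
best_on_grid = min(math.dist(x, y) for y in grid)

print(k, math.dist(x, k), best_on_grid)
```

No grid point is closer to $x$ than the clamped point, as the theorem predicts for the minimizer.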


We say that the vector $y \in X$ is orthogonal to the vector $x$ if $(x, y) = 0$. In
that case also $(y, x) = \overline{(x, y)} = 0$, so that the orthogonality relation is symmetric.
For $x$ given, let $x^\perp$ denote the set of all vectors orthogonal to $x$. This is the kernel
of the linear functional $\phi = (\cdot, x)$, that is, the set $\phi^{-1}(\{0\})$. As such a kernel,
it is a subspace. Since $|\phi(y) - \phi(z)| = |(y - z, x)| \le \|y - z\| \, \|x\|$ by Schwarz's
inequality, $\phi$ is continuous, and therefore $x^\perp = \phi^{-1}(\{0\})$ is closed. Thus, $x^\perp$ is
a closed subspace. More generally, for any non-empty subset $A$ of $X$, define

$$A^\perp := \bigcap_{x \in A} x^\perp = \{y \in X; \ (y, x) = 0 \text{ for all } x \in A\}.$$

As the intersection of closed subspaces, $A^\perp$ is a closed subspace of $X$.


Theorem 1.36 (Orthogonal decomposition theorem). Let $Y$ be a closed
subspace of the Hilbert space $X$. Then $X$ is the direct sum of $Y$ and $Y^\perp$, that is,
each $x \in X$ has the unique orthogonal decomposition $x = y + z$ with $y \in Y$ and
$z \in Y^\perp$.


Note that the so-called components $y$ and $z$ of $x$ (in $Y$ and $Y^\perp$, respectively)
are orthogonal.


Proof. As a closed subspace of $X$, $Y$ is a non-empty, closed, convex subset of
$X$. By the distance theorem, there exists a unique $y \in Y$ such that

$$\|x - y\| = d := d(x, Y).$$

Letting $z := x - y$, the existence part of the theorem will follow if we show that
$(z, u) = 0$ for all $u \in Y$. Since $Y$ is a subspace, and $Y \ne \{0\}$ without loss of
generality, every $u \in Y$ is a scalar multiple of a unit vector in $Y$, so it suffices to
prove that $(z, u) = 0$ for unit vectors $u \in Y$. For all $\lambda \in \mathbb{C}$, by the identity (5)
(following Definition 1.34),

$$\|z - \lambda u\|^2 = \|z\|^2 - 2\Re[\bar{\lambda}(z, u)] + |\lambda|^2.$$

The left-hand side is

$$\|x - (y + \lambda u)\|^2 \ge d^2,$$

since $y + \lambda u \in Y$. Since $\|z\| = d$, we obtain

$$0 \le -2\Re[\bar{\lambda}(z, u)] + |\lambda|^2.$$

Choose $\lambda = (z, u)$. Then $0 \le -|(z, u)|^2$, so that $(z, u) = 0$ as claimed.




If $x = y + z = y' + z'$ are two decompositions with $y, y' \in Y$ and $z, z' \in Y^\perp$,
then $y - y' = z' - z \in Y \cap Y^\perp$, so that in particular $y - y'$ is orthogonal to itself
(i.e., $(y - y', y - y') = 0$), which implies that $y - y' = 0$, whence $y = y'$ and
$z = z'$.
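
The decomposition can be made explicit when $Y$ is one-dimensional. The following added sketch assumes $X = \mathbb{C}^3$ with the standard inner product and $Y$ the span of a unit vector $u$, so that the component of $x$ in $Y$ is $(x, u)u$. The vectors are arbitrary sample data.

```python
import math


def inner(x, y):
    """Inner product on C^n, linear in x and conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


v = [1 + 1j, 2 + 0j, -1j]
length = math.sqrt(inner(v, v).real)
u = [a / length for a in v]           # unit vector spanning Y

x = [3 + 0j, 1j, 2 - 2j]
coeff = inner(x, u)
y = [coeff * a for a in u]            # component of x in Y
z = [a - b for a, b in zip(x, y)]     # component of x in the orthogonal complement

# z should be orthogonal to u, and the squared norms should add up.
print(abs(inner(z, u)), inner(x, x).real - inner(y, y).real - inner(z, z).real)
```

Both printed quantities should vanish up to rounding: the first is $|(z, u)|$, the second is the defect in $\|x\|^2 = \|y\|^2 + \|z\|^2$.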


We observed in passing that for each given $y \in X$, the function $\phi := (\cdot, y)$ is
a continuous linear functional on the inner product space $X$. For Hilbert spaces,
this is the general form of continuous linear functionals:


Theorem 1.37 (“Little” Riesz representation theorem). Let $\phi: X \to \mathbb{C}$
be a continuous linear functional on the Hilbert space $X$. Then there exists a
unique $y \in X$ such that $\phi = (\cdot, y)$.


Proof. If $\phi = 0$ (the zero functional), take $y = 0$. Assume then that $\phi \ne 0$,
so that its kernel $Y$ is a closed subspace $\ne X$. Therefore $Y^\perp \ne \{0\}$, by
Theorem 1.36. Let then $z \in Y^\perp$ be a unit vector. Since $Y \cap Y^\perp = \{0\}$, $z \notin Y$,
so that $\phi(z) \ne 0$. For any given $x \in X$, we may then define

$$u := x - \frac{\phi(x)}{\phi(z)} z.$$

By linearity,

$$\phi(u) = \phi(x) - \frac{\phi(x)}{\phi(z)} \phi(z) = 0,$$

that is, $u \in Y$, and

$$x = u + \frac{\phi(x)}{\phi(z)} z \tag{4}$$

is the (unique) orthogonal decomposition of $x$ (corresponding to the particular
subspace $Y$, the kernel of $\phi$). Define now $y = \overline{\phi(z)} z \ (\in Y^\perp)$. By (4),

$$(x, y) = (u, y) + \frac{\phi(x)}{\phi(z)} \phi(z)(z, z) = \phi(x),$$

since $(u, y) = 0$ and $\|z\| = 1$. This proves the existence part of the theorem.
Suppose now that $y, y' \in X$ are such that $\phi(x) = (x, y) = (x, y')$ for all $x \in X$.
Then $(x, y - y') = 0$ for all $x$, hence in particular $(y - y', y - y') = 0$, which
implies that $y = y'$.
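
The construction in the proof can be followed step by step in finite dimensions. This added sketch assumes $X = \mathbb{C}^3$ and a functional $\phi(x) = \sum_i c_i x_i$ with made-up coefficients $c$. The unit vector $z$ orthogonal to the kernel is then proportional to the conjugate of $c$.

```python
import math

# Hypothetical coefficients defining phi(x) = sum_i c[i] * x[i] on C^3.
c = [2 - 1j, 3j, 1.5 + 0j]


def phi(x):
    return sum(ci * xi for ci, xi in zip(c, x))


def inner(x, y):
    """Inner product on C^n, linear in x and conjugate linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


# Unit vector z orthogonal to ker(phi): (u, z) = phi(u)/||c|| for every u.
norm_c = math.sqrt(sum(abs(ci) ** 2 for ci in c))
z = [ci.conjugate() / norm_c for ci in c]

# Representing vector from the proof: y = conj(phi(z)) * z.
y = [phi(z).conjugate() * zi for zi in z]

x = [1j, -2 + 0j, 0.5 - 0.5j]
print(phi(x), inner(x, y))
```

The two printed numbers should agree up to rounding, and here $y$ is simply the coordinatewise conjugate of $c$.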


1.8 The Lebesgue—Radon—Nikodym theorem 


We apply the Riesz representation theorem to prove the Lebesgue decomposition
theorem and the Radon–Nikodym theorem for (positive) measures.

We start with a measure-theoretic lemma.

The positive measure space $(X, \mathcal{A}, \mu)$ is $\sigma$-finite if there exists a sequence of
mutually disjoint measurable sets $X_j$ with union $X$, such that $\mu(X_j) < \infty$ for
all $j$.




Lemma 1.38 (The averages lemma). Let $(X, \mathcal{A}, \sigma)$ be a $\sigma$-finite positive
measure space. Let $g \in L^1(\sigma)$ be such that, for all $E \in \mathcal{A}$ with $0 < \sigma(E) < \infty$,
the “averages”

$$A_E(g) := \frac{1}{\sigma(E)} \int_E g \, d\sigma$$

are contained in some given closed set $F \subset \mathbb{C}$. Then $g(x) \in F$ $\sigma$-a.e.


Proof. We need to prove that $g^{-1}(F^c)$ is $\sigma$-null. Write the open set $F^c$ as the
countable union of the closed discs

$$\Delta_n := \{z \in \mathbb{C}; \ |z - a_n| \le r_n\}, \qquad n = 1, 2, \ldots$$

Then

$$g^{-1}(F^c) = \bigcup_{n=1}^{\infty} g^{-1}(\Delta_n),$$

and it suffices to prove that $E_\Delta := g^{-1}(\Delta)$ is $\sigma$-null whenever $\Delta$ is a closed disc
(with center $a$ and radius $r$) contained in $F^c$.
Write $X$ as the countable union of mutually disjoint measurable sets $X_k$, with
$\sigma(X_k) < \infty$. Set $E_{\Delta,k} := E_\Delta \cap X_k$, and suppose $\sigma(E_{\Delta,k}) > 0$ for some $\Delta$ as above
and some $k$. Since $|g(x) - a| \le r$ on $E := E_{\Delta,k}$ and $0 < \sigma(E) < \infty$, we have

$$|A_E(g) - a| = |A_E(g - a)| \le \frac{1}{\sigma(E)} \int_E |g - a| \, d\sigma \le r,$$

so that $A_E(g) \in \Delta \subset F^c$, contradicting the hypothesis. Hence $\sigma(E_{\Delta,k}) = 0$ for
all $k$ and therefore $\sigma(E_\Delta) = 0$ for all $\Delta$ as above.


Lemma 1.39. Let $0 \le \lambda \le \sigma$ be finite measures on the measurable space $(X, \mathcal{A})$.
Then there exists a measurable function $g: X \to [0, 1]$ such that

$$\int f \, d\lambda = \int f g \, d\sigma \tag{1}$$

for all $f \in L^2(\sigma)$.


Proof. By Definition 1.12, the relation $\lambda \le \sigma$ between positive measures implies
that $\int f \, d\lambda \le \int f \, d\sigma$ for all non-negative measurable functions $f$. Hence $L^2(\sigma) \subset
L^2(\lambda)$ $(\subset L^1(\lambda)$, by the second proposition following Theorem 1.26).

For all $f \in L^2(\sigma)$, we have then by Schwarz's inequality:

$$\left| \int f \, d\lambda \right| \le \int |f| \, d\lambda \le \int |f| \, d\sigma \le \sigma(X)^{1/2} \|f\|_{L^2(\sigma)}.$$

Replacing $f$ by $f - h$ (with $f, h \in L^2(\sigma)$), we get

$$\left| \int f \, d\lambda - \int h \, d\lambda \right| = \left| \int (f - h) \, d\lambda \right| \le \sigma(X)^{1/2} \|f - h\|_{L^2(\sigma)},$$

so that the functional $f \mapsto \int f \, d\lambda$ is a continuous linear functional on $L^2(\sigma)$.
By the Riesz representation theorem for the Hilbert space $L^2(\sigma)$, there exists an
element $g_1 \in L^2(\sigma)$ such that this functional is $(\cdot, g_1)$. Letting $g = \bar{g}_1 \ (\in L^2(\sigma))$,
we get the wanted relation (1).

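Since $I_E \in L^2(\sigma)$ (because $\sigma$ is a finite measure), we have in particular

$$\lambda(E) = \int I_E \, d\lambda = \int_E g \, d\sigma$$

for all $E \in \mathcal{A}$. If $\sigma(E) > 0$,

$$\frac{1}{\sigma(E)} \int_E g \, d\sigma = \frac{\lambda(E)}{\sigma(E)} \in [0, 1].$$

By the Averages Lemma 1.38, $g(x) \in [0, 1]$ $\sigma$-a.e., and we may then choose a
representative of the equivalence class $g$ with range in $[0, 1]$.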


Terminology. Let $(X, \mathcal{A}, \lambda)$ be a positive measure space. We say that the set
$A \in \mathcal{A}$ carries the measure $\lambda$ (or that $\lambda$ is supported by $A$) if $\lambda(E) = \lambda(E \cap A)$
for all $E \in \mathcal{A}$.

This is, of course, equivalent to $\lambda(E) = 0$ for all measurable subsets $E$ of $A^c$.

Two (positive) measures $\lambda_1, \lambda_2$ on $(X, \mathcal{A})$ are mutually singular (notation
$\lambda_1 \perp \lambda_2$) if they are carried by disjoint measurable sets $A_1, A_2$. Equivalently,
each measure is carried by a null set relative to the other measure.

On the other hand, if $\lambda_2(E) = 0$ whenever $\lambda_1(E) = 0$ (for $E \in \mathcal{A}$), we say
that $\lambda_2$ is absolutely continuous with respect to $\lambda_1$ (notation: $\lambda_2 \ll \lambda_1$).

Equivalently, $\lambda_2 \ll \lambda_1$ if and only if any (measurable) set that carries $\lambda_1$ also
carries $\lambda_2$.


Theorem 1.40 (Lebesgue–Radon–Nikodym). Let $(X, \mathcal{A}, \mu)$ be a $\sigma$-finite
positive measure space, and let $\lambda$ be a finite positive measure on $(X, \mathcal{A})$. Then

(a) $\lambda$ has the unique (so-called) Lebesgue decomposition

$$\lambda = \lambda_a + \lambda_s$$

with $\lambda_a \ll \mu$ and $\lambda_s \perp \mu$;
(b) there exists a unique $h \in L^1(\mu)$ such that

$$\lambda_a(E) = \int_E h \, d\mu$$

for all $E \in \mathcal{A}$.

(Part (a) is the Lebesgue decomposition theorem; part (b) is the Radon–Nikodym
theorem.)




Proof. Case $\mu(X) < \infty$.

Let $\sigma := \lambda + \mu$. Then the finite positive measures $\lambda, \sigma$ satisfy $\lambda \le \sigma$, so that
by Lemma 1.39, there exists a measurable function $g: X \to [0, 1]$ such that (1)
holds, that is, after rearrangement,

$$\int f(1 - g) \, d\lambda = \int f g \, d\mu \tag{2}$$

for all $f \in L^2(\sigma)$. Define

$$A := g^{-1}([0, 1)); \qquad B := g^{-1}(\{1\}).$$

Then $A, B$ are disjoint measurable sets with union $X$.
Taking $f = I_B$ ($\in L^2(\sigma)$, since $\sigma$ is a finite measure) in (2), we obtain
$\mu(B) = 0$ (since $g = 1$ on $B$). Therefore, the measure $\lambda_s$ defined on $\mathcal{A}$ by

$$\lambda_s(E) := \lambda(E \cap B)$$

satisfies $\lambda_s \perp \mu$.
Define similarly $\lambda_a(E) := \lambda(E \cap A)$; this is a positive measure on $\mathcal{A}$, mutually
singular with $\lambda_s$ (since it is carried by $A = B^c$), and by additivity of measures,

$$\lambda(E) = \lambda(E \cap A) + \lambda(E \cap B) = \lambda_a(E) + \lambda_s(E),$$

so that the Lebesgue decomposition will follow if we show that $\lambda_a \ll \mu$. This
follows trivially from the integral representation (b), which we proceed to prove.
For each $n \in \mathbb{N}$ and $E \in \mathcal{A}$, take in (2)

$$f = f_n := (1 + g + \cdots + g^n) I_E.$$

(Since $0 \le g \le 1$, $f$ is a bounded measurable function, hence $f \in L^2(\sigma)$.) We
obtain

$$\int_E (1 - g^{n+1}) \, d\lambda = \int_E (g + g^2 + \cdots + g^{n+1}) \, d\mu. \tag{3}$$

Since $g = 1$ on $B$, the left-hand side equals $\int_{E \cap A} (1 - g^{n+1}) \, d\lambda$. However,
$0 \le g < 1$ on $A$, so that the integrands form a non-decreasing sequence
of non-negative measurable functions converging pointwise to 1. By the
monotone convergence theorem, the left-hand side of (3) converges therefore to
$\lambda(E \cap A) = \lambda_a(E)$. The integrands on the right-hand side of (3) form a
non-decreasing sequence of non-negative measurable functions converging pointwise
to the (measurable) function

$$h := \sum_{n=1}^{\infty} g^n.$$

Again, by monotone convergence, the right-hand side of (3) converges to $\int_E h \, d\mu$,
and the representation (b) follows. Taking in particular $E = X$, we get

$$\|h\|_{L^1(\mu)} = \int_X h \, d\mu = \lambda_a(X) = \lambda(A) < \infty,$$

so that $h \in L^1(\mu)$, and the existence part of the theorem is proved in case
$\mu(X) < \infty$.

General case. Let $X_j \in \mathcal{A}$ be mutually disjoint, with union $X$, such that
$0 < \mu(X_j) < \infty$. Define

$$w := \sum_j \frac{1}{2^j \mu(X_j)} I_{X_j}.$$

This is a strictly positive $\mu$-integrable function, with $\|w\|_1 = 1$. Consider the
positive measure

$$\nu(E) := \int_E w \, d\mu.$$

Then $\nu(X) = \|w\|_1 = 1$, and $\nu \ll \mu$. On the other hand, if $\nu(E) = 0$, then
$(1/2^j \mu(X_j)) \mu(E \cap X_j) = 0$, hence $\mu(E \cap X_j) = 0$ for all $j$, and therefore
$\mu(E) = 0$. This shows that $\mu \ll \nu$ as well (one says that the measures $\mu$ and $\nu$
are mutually absolutely continuous, or equivalent).

Since $\nu$ is a finite measure, the first part of the proof gives the decomposition
$\lambda = \lambda_a + \lambda_s$ with $\lambda_a \ll \nu$ (hence $\lambda_a \ll \mu$ by the trivial transitivity of the relation
$\ll$), and $\lambda_s \perp \nu$ (hence $\lambda_s \perp \mu$, because $\lambda_s$ is supported by a $\nu$-null set, which is
also $\mu$-null, since $\mu \ll \nu$). The first part of the proof gives also the representation
(cf. Theorem 1.17)

$$\lambda_a(E) = \int_E h \, d\nu = \int_E h w \, d\mu = \int_E \tilde{h} \, d\mu,$$

where $\tilde{h} := hw$ is non-negative, measurable, and

$$\|\tilde{h}\|_{L^1(\mu)} = \int_X \tilde{h} \, d\mu = \lambda_a(X) \le \lambda(X) < \infty.$$

This completes the proof of the “existence part” of the theorem in the general
case.
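To prove the uniqueness of the Lebesgue decomposition, suppose

$$\lambda = \lambda_a + \lambda_s = \lambda_a' + \lambda_s'$$

with $\lambda_a, \lambda_a' \ll \mu$ and $\lambda_s, \lambda_s' \perp \mu$.

Let $B$ be a $\mu$-null set that carries both $\lambda_s$ and $\lambda_s'$. Then

$$\lambda_a(B) = \lambda_a'(B) = 0 \quad \text{and} \quad \lambda_s(B^c) = \lambda_s'(B^c) = 0,$$

so that for all $E \in \mathcal{A}$,

$$\lambda_a(E) = \lambda_a(E \cap B^c) = \lambda(E \cap B^c) = \lambda_a'(E \cap B^c) = \lambda_a'(E),$$

hence also $\lambda_s(E) = \lambda_s'(E)$.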


In order to prove the uniqueness of $h$ in (b), suppose $h, h' \in L^1(\mu)$ satisfy

$$\lambda_a(E) = \int_E h \, d\mu = \int_E h' \, d\mu.$$

Then $h - h' \in L^1(\mu)$ satisfies $\int_E (h - h') \, d\mu = 0$ for all $E \in \mathcal{A}$, and it follows from
Proposition 1.22 that $h - h' = 0$ $\mu$-a.e., that is, $h = h'$ as elements of $L^1(\mu)$.
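
On a finite set the whole proof can be traced with exact arithmetic. The sketch below is an added illustration in which the point masses of $\lambda$ and $\mu$ are invented. It builds $\sigma = \lambda + \mu$, the density $g$ of $\lambda$ with respect to $\sigma$, the sets $A$ and $B$, and $h = \sum_{n \ge 1} g^n = g/(1 - g)$ on $A$.

```python
from fractions import Fraction as Fr

# Hypothetical point masses of two finite positive measures on {a, b, c, d}.
points = ["a", "b", "c", "d"]
lam = {"a": Fr(1), "b": Fr(3), "c": Fr(2), "d": Fr(0)}
mu = {"a": Fr(2), "b": Fr(1), "c": Fr(0), "d": Fr(5)}

sigma = {x: lam[x] + mu[x] for x in points}    # sigma = lam + mu
g = {x: lam[x] / sigma[x] for x in points}     # sigma > 0 at every point here

B = [x for x in points if g[x] == 1]           # carries the singular part
A = [x for x in points if g[x] < 1]            # carries the continuous part

# On A the geometric series sum_{n>=1} g^n equals g / (1 - g).
h = {x: (g[x] / (1 - g[x]) if x in A else Fr(0)) for x in points}

lam_a = {x: h[x] * mu[x] for x in points}      # lam_a(E) = integral of h over E
lam_s = {x: (lam[x] if x in B else Fr(0)) for x in points}

print(B, {x: str(h[x]) for x in points})
```

In this toy case $B$ consists of the single $\mu$-null point carrying mass of $\lambda$, and $h$ reduces to the ratio of point masses $\lambda(\{x\})/\mu(\{x\})$ wherever $\mu(\{x\}) > 0$.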


If the measure $\lambda$ is absolutely continuous with respect to $\mu$, it has the trivial
Lebesgue decomposition $\lambda = \lambda + 0$, with the zero measure as singular part. By
uniqueness, it follows that $\lambda_a = \lambda$, and therefore part (b) of the theorem gives
the representation $\lambda(E) = \int_E h \, d\mu$ for all $E \in \mathcal{A}$. Conversely, such an integral
representation of $\lambda$ implies trivially that $\lambda \ll \mu$ (if $\mu(E) = 0$, the function
$h I_E = 0$ $\mu$-a.e., and therefore $\lambda(E) = \int h I_E \, d\mu = 0$). Thus


Theorem 1.41 (Radon–Nikodym). Let $(X, \mathcal{A}, \mu)$ be a $\sigma$-finite positive
measure space. A finite positive measure $\lambda$ on $\mathcal{A}$ is absolutely continuous with
respect to $\mu$ if and only if there exists $h \in L^1(\mu)$ such that

$$\lambda(E) = \int_E h \, d\mu \qquad (E \in \mathcal{A}). \tag{*}$$

By Theorem 1.17, Relation (*) implies that

$$\int g \, d\lambda = \int g h \, d\mu \tag{**}$$

for all non-negative measurable functions $g$ on $X$. Since we may take $g = I_E$
in (**), this last relation implies (*). As mentioned after Theorem 1.17, these
equivalent relations are symbolically written in the form $d\lambda = h \, d\mu$. It follows
easily from Theorem 1.17 that, in that case, if $g \in L^1(\lambda)$, then $gh \in L^1(\mu)$
and (**) is valid for such (complex) functions $g$. The function $h$ is called the
Radon–Nikodym derivative of $\lambda$ with respect to $\mu$, and is denoted $d\lambda/d\mu$.


1.9 Complex measures 


Definition 1.42. Let $(X, \mathcal{A})$ be an arbitrary measurable space. A complex
measure on $\mathcal{A}$ is a $\sigma$-additive function $\mu: \mathcal{A} \to \mathbb{C}$, that is,

$$\mu\Big(\bigcup_n E_n\Big) = \sum_n \mu(E_n) \tag{1}$$

for any sequence of mutually disjoint sets $E_n \in \mathcal{A}$.


Since the left-hand side of (1) is independent of the order of the sets $E_n$ and
is a complex number, the right-hand side converges in $\mathbb{C}$ unconditionally, hence
absolutely. Taking $E_n = \emptyset$ for all $n$, the convergence of (1) shows that $\mu(\emptyset) = 0$.
It follows that $\mu$ is (finitely) additive, and since its values are complex numbers,
it is “subtractive” as well (i.e., $\mu(E - F) = \mu(E) - \mu(F)$ whenever $E, F \in \mathcal{A}$,
$F \subset E$).

A partition of $E \in \mathcal{A}$ is a sequence of mutually disjoint sets $A_k \in \mathcal{A}$ with
union equal to $E$. We set

$$|\mu|(E) := \sup \sum_k |\mu(A_k)|, \tag{2}$$

where the supremum is taken over all partitions of $E$.


Theorem 1.43. Let $\mu$ be a complex measure on $\mathcal{A}$, and define $|\mu|$ by (2). Then
$|\mu|$ is a finite positive measure on $\mathcal{A}$ that dominates $\mu$ (i.e., $|\mu(E)| \le |\mu|(E)$ for
all $E \in \mathcal{A}$).


Proof. Let $E = \bigcup_n E_n$ with $E_n \in \mathcal{A}$ mutually disjoint ($n \in \mathbb{N}$). For any partition
$\{A_k\}$ of $E$, $\{A_k \cap E_n\}_k$ is a partition of $E_n$ ($n = 1, 2, \ldots$), so that

$$\sum_k |\mu(A_k \cap E_n)| \le |\mu|(E_n), \qquad n = 1, 2, \ldots$$

We sum these inequalities over all $n$, interchange the order of summation in
the double sum (of non-negative terms!), and use the triangle inequality to
obtain

$$\sum_n |\mu|(E_n) \ge \sum_k \sum_n |\mu(A_k \cap E_n)| \ge \sum_k |\mu(A_k)|,$$

since $\{A_k \cap E_n\}_n$ is a partition of $A_k$, for each $k \in \mathbb{N}$. Taking now the supremum
over all partitions $\{A_k\}$ of $E$, it follows that

$$\sum_n |\mu|(E_n) \ge |\mu|(E). \tag{3}$$

On the other hand, given $\epsilon > 0$, there exists a partition $\{A_{n,k}\}_k$ of $E_n$ such
that

$$\sum_k |\mu(A_{n,k})| > |\mu|(E_n) - \epsilon/2^n, \qquad n = 1, 2, \ldots$$

Since $\{A_{n,k}\}_{n,k}$ is a partition of $E$, we obtain

$$|\mu|(E) \ge \sum_{n,k} |\mu(A_{n,k})| > \sum_n |\mu|(E_n) - \epsilon.$$

Letting $\epsilon \to 0+$ and using (3), we conclude that $|\mu|$ is $\sigma$-additive. Since $|\mu|(\emptyset) = 0$
is trivial, $|\mu|$ is indeed a positive measure on $\mathcal{A}$.


In order to show that the measure $|\mu|$ is finite, we need the following.




Lemma. Let $F \subset \mathbb{C}$ be a finite set. Then it contains a subset $E$ such that

$$\Big| \sum_{z \in E} z \Big| \ge \sum_{z \in F} |z| / 4\sqrt{2}.$$


Proof of lemma. Let $S$ be the sector

$$S = \{z = re^{i\theta}; \ r \ge 0, \ |\theta| \le \pi/4\}.$$

For $z \in S$, $\Re z = |z| \cos\theta \ge |z| \cos \pi/4 = |z|/\sqrt{2}$. Similarly, if $z \in -S$, then
$-z \in S$, so that $-\Re z = \Re(-z) \ge |-z|/\sqrt{2} = |z|/\sqrt{2}$. If $z \in iS$, then $-iz \in S$,
so that $\Im z = \Re(-iz) \ge |-iz|/\sqrt{2} = |z|/\sqrt{2}$. Similarly, if $z \in -iS$, one obtains
as before $-\Im z \ge |z|/\sqrt{2}$. Denote

$$a := \sum_{z \in F} |z|; \qquad a_k := \sum_{z \in F \cap (i^k S)} |z|, \quad k = 0, 1, 2, 3.$$

Since $\mathbb{C} = \bigcup_{k=0}^{3} i^k S$, we have $\sum_{k=0}^{3} a_k \ge a$, and therefore there exists $k \in
\{0, 1, 2, 3\}$ such that $a_k \ge a/4$. Fix such a $k$ and define $E = F \cap (i^k S)$.
In case $k = 0$, that is, in case $a_0 \ge a/4$, we have

$$\Big| \sum_{z \in E} z \Big| \ge \Re \sum_{z \in E} z = \sum_{z \in F \cap S} \Re z \ge \sum_{z \in F \cap S} |z|/\sqrt{2} = a_0/\sqrt{2} \ge a/4\sqrt{2}.$$

Similarly, in case $k = 2$, replacing $\Re$ by $-\Re$ and $S$ by $-S$, the same inequality
is obtained. In cases $k = 1$ ($k = 3$), we replace $\Re$ and $S$ by $\Im$ ($-\Im$) and $iS$ ($-iS$)
respectively, in the calculation. In all cases, we obtain $|\sum_{z \in E} z| \ge a/4\sqrt{2}$, as
wanted.
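
The selection rule in the proof is easy to run on a concrete finite set. The added sketch below tests membership in $i^k S$ by rotating back by $k$ quarter turns and checking $\Re w \ge |\Im w|$, which describes the same sector $S$. The sample set $F$ and the function names are assumptions made for the example.

```python
import math


def in_sector(z, k):
    """True if z lies in i**k * S, where S = {w : Re w >= |Im w|}."""
    w = z * (-1j) ** k   # rotate z back by k quarter turns
    return w.real >= abs(w.imag)


def good_subset(F):
    """Return (E, a): E = F intersected with the heaviest sector, a = sum |z|."""
    total = sum(abs(z) for z in F)
    weights = [sum(abs(z) for z in F if in_sector(z, k)) for k in range(4)]
    k = max(range(4), key=lambda j: weights[j])
    return [z for z in F if in_sector(z, k)], total


F = [1 + 0.2j, -2 + 1j, 0.5j, -1 - 1j, 3 - 2.5j, -0.3 + 0j]
E, a = good_subset(F)
bound = a / (4 * math.sqrt(2))

print(E, abs(sum(E)), bound)
```

The modulus of the sum over $E$ should be at least the printed bound $a/4\sqrt{2}$.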


Returning to the proof of the finiteness of the measure $|\mu|$, suppose
$|\mu|(A) = \infty$ for some $A \in \mathcal{A}$. Then there exists a partition $\{A_i\}$ of $A$ such
that $\sum_i |\mu(A_i)| > 4\sqrt{2}(1 + |\mu(A)|)$, and therefore there exists $n$ such that

$$\sum_{i=1}^{n} |\mu(A_i)| > 4\sqrt{2}(1 + |\mu(A)|).$$

Take in the lemma

$$F = \{\mu(A_i); \ i = 1, \ldots, n\},$$

let the corresponding subset $E$ be associated with the set of indices $J \subset
\{1, \ldots, n\}$, and define

$$B := \bigcup_{i \in J} A_i \quad (\subset A).$$

Then

$$|\mu(B)| = \Big| \sum_{i \in J} \mu(A_i) \Big| \ge \sum_{i=1}^{n} |\mu(A_i)| / 4\sqrt{2} > 1 + |\mu(A)|.$$




If $C := A - B$, then

$$|\mu(C)| = |\mu(A) - \mu(B)| \ge |\mu(B)| - |\mu(A)| > 1.$$

Also $\infty = |\mu|(A) = |\mu|(B) + |\mu|(C)$ (since $|\mu|$ is a measure), so one of
the summands at least is infinite, and for both subsets, the $\mu$-measure has
modulus $> 1$.

Thus, we proved that any $A \in \mathcal{A}$ with $|\mu|(A) = \infty$ is the disjoint union of
subsets $B_1, C_1 \in \mathcal{A}$, with $|\mu|(B_1) = \infty$ and $|\mu(C_1)| > 1$.

Since $|\mu|(B_1) = \infty$, $B_1$ is the disjoint union of subsets $B_2, C_2 \in \mathcal{A}$,
with $|\mu|(B_2) = \infty$ and $|\mu(C_2)| > 1$. Continuing, we obtain two sequences
$\{B_n\}, \{C_n\} \subset \mathcal{A}$ with the following properties:

$$B_n = B_{n+1} \cup C_{n+1} \ \text{(disjoint union)}; \quad |\mu|(B_n) = \infty; \quad |\mu(C_n)| > 1; \quad n = 1, 2, \ldots$$

For $i > j \ge 1$, since $B_{n+1} \subset B_n$ for all $n$, we have $C_i \cap C_j \subset B_{i-1} \cap C_j \subset B_j \cap
C_j = \emptyset$, so $C := \bigcup_n C_n$ is a disjoint union. Hence the series $\sum_n \mu(C_n) = \mu(C) \in
\mathbb{C}$ converges. In particular $\mu(C_n) \to 0$, contradicting the fact that $|\mu(C_n)| > 1$
for all $n = 1, 2, \ldots$

Finally, since $\{E, \emptyset, \emptyset, \ldots\}$ is a partition of $E \in \mathcal{A}$, the inequality $|\mu(E)| \le
|\mu|(E)$ follows from (2).


Definition 1.44. The finite positive measure $|\mu|$ is called the total variation
measure of $\mu$, and $\|\mu\| := |\mu|(X)$ is called the total variation of $\mu$.


Let $M(X, \mathcal{A})$ denote the complex vector space of all complex measures on $\mathcal{A}$
(with the “natural” operations $(\mu + \nu)(E) = \mu(E) + \nu(E)$ and $(c\mu)(E) = c\mu(E)$,
for $\mu, \nu \in M(X, \mathcal{A})$ and $c \in \mathbb{C}$). With the total variation norm, $M(X, \mathcal{A})$ is a
normed space. We verify below its completeness.


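Proposition. $M(X, \mathcal{A})$ is a Banach space.
Proof. Let $\{\mu_n\} \subset M := M(X, \mathcal{A})$ be Cauchy. For all $E \in \mathcal{A}$,

$$|\mu_n(E) - \mu_m(E)| \le \|\mu_n - \mu_m\| \to 0$$

as $n, m \to \infty$, so that

$$\mu(E) := \lim_n \mu_n(E)$$

exists. Clearly $\mu$ is additive. Let $E$ be the union of the mutually disjoint sets
$E_k \in \mathcal{A}$, $k = 1, 2, \ldots$, let $A_N = \bigcup_{k=1}^{N} E_k$, and let $\epsilon > 0$. Let $n_0 \in \mathbb{N}$ be such that

$$\|\mu_n - \mu_m\| < \epsilon \tag{*}$$

for all $n, m > n_0$. We have

$$\Big| \mu(E) - \sum_{k=1}^{N} \mu(E_k) \Big| = |\mu(E) - \mu(A_N)| = |\mu(E - A_N)| \le |(\mu - \mu_n)(E - A_N)| + |\mu_n(E - A_N)|.$$

Since

$$|(\mu_n - \mu_m)(E - A_N)| \le \|\mu_n - \mu_m\| < \epsilon$$

for all $n, m > n_0$, letting $m \to \infty$, we obtain $|(\mu_n - \mu)(E - A_N)| \le \epsilon$ for all
$n > n_0$ and all $N \in \mathbb{N}$. Therefore,

$$\Big| \mu(E) - \sum_{k=1}^{N} \mu(E_k) \Big| \le \epsilon + |\mu_n(E - A_N)|$$

for all $n > n_0$ and $N \in \mathbb{N}$. Fix $n > n_0$ and let $N \to \infty$. Since $\mu_n \in M$, we have
$\mu_n(E - A_N) \to 0$, and we obtain

$$\limsup_N \Big| \mu(E) - \sum_{k=1}^{N} \mu(E_k) \Big| \le \epsilon,$$

so that $\mu(E) = \sum_{k=1}^{\infty} \mu(E_k)$. Thus $\mu \in M$.
Finally, we show that $\|\mu - \mu_n\| \to 0$. Let $\{E_k\}$ be a partition of $X$. By (*),
for all $N \in \mathbb{N}$ and $n, m > n_0$,

$$\sum_{k=1}^{N} |(\mu_m - \mu_n)(E_k)| \le \|\mu_m - \mu_n\| < \epsilon.$$

Letting $m \to \infty$, we get

$$\sum_{k=1}^{N} |(\mu - \mu_n)(E_k)| \le \epsilon$$

for all $N$ and $n > n_0$. Letting $N \to \infty$, we obtain $\sum_{k=1}^{\infty} |(\mu - \mu_n)(E_k)| \le \epsilon$ for
all $n > n_0$, and since the partition was arbitrary, it follows that $\|\mu - \mu_n\| \le \epsilon$ for
$n > n_0$.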


If $\mu \in M(X, \mathcal{A})$ has real range, it is called a real measure. For example,
$\Re\mu$ (defined by $(\Re\mu)(E) = \Re[\mu(E)]$) and $\Im\mu$ are real measures for any complex
measure $\mu$, and $\mu = \Re\mu + i\Im\mu$.

If $\mu$ is a real measure, since $|\mu(E)| \le |\mu|(E)$ for all $E \in \mathcal{A}$, the measures

$$\mu^+ := (1/2)(|\mu| + \mu); \qquad \mu^- := (1/2)(|\mu| - \mu)$$

are finite positive measures, called the positive and negative variation measures,
respectively.
Clearly

$$\mu = \mu^+ - \mu^- \tag{4}$$

and

$$|\mu| = \mu^+ + \mu^-.$$

Representation (4) of a real measure as the difference of two finite positive
measures is called the Jordan decomposition of the real measure $\mu$.




For a complex measure $\lambda$, write first $\lambda = \nu + i\sigma$ with $\nu := \Re\lambda$ and $\sigma := \Im\lambda$;
then write the Jordan decompositions of $\nu$ and $\sigma$. It then follows that any
complex measure can be written as the linear combination

$$\lambda = \sum_{k=0}^{3} i^k \lambda_k \tag{5}$$

of four finite positive measures $\lambda_k$.

If $\mu$ is a positive measure and $\lambda$ is a complex measure (both on the
$\sigma$-algebra $\mathcal{A}$), we say that $\lambda$ is absolutely continuous with respect to $\mu$ (notation:
$\lambda \ll \mu$) if $\lambda(E) = 0$ whenever $\mu(E) = 0$, $E \in \mathcal{A}$; $\lambda$ is carried (or supported) by
the set $A \in \mathcal{A}$ if $\lambda(E) = 0$ for all measurable subsets $E$ of $A^c$; $\lambda$ is singular with
respect to $\mu$ if it is carried by a $\mu$-null set. Two complex measures $\lambda_1, \lambda_2$ are
mutually singular (notation $\lambda_1 \perp \lambda_2$) if they are carried by disjoint measurable
sets.

It follows immediately from (5) that Theorems 1.40 and 1.41 extend verbatim
to the case of a complex measure $\lambda$. This is stated formally for future reference.


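Theorem 1.45. Let $(X, \mathcal{A}, \mu)$ be a $\sigma$-finite positive measure space, and let $\lambda$ be
a complex measure on $\mathcal{A}$. Then

(1) $\lambda$ has a unique Lebesgue decomposition

$$\lambda = \lambda_a + \lambda_s$$

with $\lambda_a \ll \mu$ and $\lambda_s \perp \mu$;
(2) there exists a unique $h \in L^1(\mu)$ such that

$$\lambda_a(E) = \int_E h \, d\mu \qquad (E \in \mathcal{A});$$

(3) $\lambda \ll \mu$ iff there exists $h \in L^1(\mu)$ such that $\lambda(E) = \int_E h \, d\mu$ for all $E \in \mathcal{A}$.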


Another useful representation of a complex measure in terms of a finite 
positive measure is the following: 


Theorem 1.46. Let $\mu$ be a complex measure on the measurable space $(X, \mathcal{A})$.
Then there exists a measurable function $h$ with $|h| = 1$ on $X$ such that
$d\mu = h \, d|\mu|$.


Proof. Since $|\mu(E)| \le |\mu|(E)$ for all $E \in \mathcal{A}$, it follows that $\mu \ll |\mu|$, and
therefore, by Theorem 1.45 (since $|\mu|$ is a finite positive measure), there exists
$h \in L^1(|\mu|)$ such that $d\mu = h \, d|\mu|$. For each $E \in \mathcal{A}$ with $|\mu|(E) > 0$,

$$\left| \frac{1}{|\mu|(E)} \int_E h \, d|\mu| \right| = \frac{|\mu(E)|}{|\mu|(E)} \le 1.$$

By Lemma 1.38, it follows that $|h| \le 1$ $|\mu|$-a.e.




We wish to show that $|\mu|([|h| < 1]) = 0$. Since $[|h| < 1] = \bigcup_n [|h| < 1 - 1/n]$,
it suffices to show that $|\mu|([|h| < r]) = 0$ for each $r < 1$. Denote $A = [|h| < r]$
and let $\{E_k\}$ be a partition of $A$. Then

$$\sum_k |\mu(E_k)| = \sum_k \left| \int_{E_k} h \, d|\mu| \right| \le r \sum_k |\mu|(E_k) = r|\mu|(A).$$

Hence

$$|\mu|(A) \le r|\mu|(A),$$

so that indeed $|\mu|(A) = 0$.
We conclude that $|h| = 1$ $|\mu|$-a.e., and since $h$ is only determined a.e., we may
replace it by a $|\mu|$-equivalent function which satisfies $|h| = 1$ everywhere.


If $\mu$ is a complex measure on $\mathcal{A}$ and $f \in L^1(|\mu|)$, there are two natural ways
to define $\int_X f \, d\mu$. One way uses decomposition (5) of $\mu$ as a linear combination
of four finite positive measures $\mu_k$, which clearly satisfy $\mu_k \le |\mu|$. Therefore
$f \in L^1(\mu_k)$ for all $k = 0, \ldots, 3$, and we define $\int_X f \, d\mu := \sum_{k=0}^{3} i^k \int_X f \, d\mu_k$. A
second possible definition uses Theorem 1.46. Since $|h| = 1$, $fh \in L^1(|\mu|)$, and
we may define $\int_X f \, d\mu := \int_X f h \, d|\mu|$. One verifies easily that the two definitions
above give the same value to the integral $\int_X f \, d\mu$. The integral thus defined is a
linear functional on $L^1(|\mu|)$. As usual, $\int_E f \, d\mu := \int_X f I_E \, d\mu$ for $E \in \mathcal{A}$.

By Theorem 1.46, every complex measure $\lambda$ is of the form $d\lambda = g \, d\mu$ for some
(finite) positive measure $\mu$ ($= |\lambda|$) and a uniquely determined $g \in L^1(\mu)$.

Conversely, given a positive measure $\mu$ and $g \in L^1(\mu)$, we may define
$d\lambda := g \, d\mu$ (as before, the meaning of this symbolic relation is that $\lambda(E) =
\int_E g \, d\mu$ for all $E \in \mathcal{A}$). If $\{E_k\}$ is a partition of $E$, and $F_n = \bigcup_{k=1}^{n} E_k$, then

$$\lambda(F_n) = \int g I_{F_n} \, d\mu = \int \sum_{k=1}^{n} g I_{E_k} \, d\mu = \sum_{k=1}^{n} \lambda(E_k).$$

Since $g I_{F_n} \to g I_E$ pointwise as $n \to \infty$ and $|g I_{F_n}| \le |g| \in L^1(\mu)$, the
dominated convergence theorem implies that the series $\sum_{k=1}^{\infty} \lambda(E_k)$ converges
to $\int g I_E \, d\mu := \lambda(E)$. Thus, $\lambda$ is a complex measure. The following theorem gives
its total variation measure.


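Theorem 1.47. Let $(X, \mathcal{A}, \mu)$ be a $\sigma$-finite positive measure space, $g \in L^1(\mu)$,
and let $\lambda$ be the complex measure $d\lambda := g \, d\mu$. Then $d|\lambda| = |g| \, d\mu$.


Proof. By Theorem 1.46, $d\lambda = h \, d|\lambda|$ with $h$ measurable and $|h| = 1$ on $X$. For
all $E \in \mathcal{A}$,

$$\int_E \bar{h} g \, d\mu = \int_E \bar{h} \, d\lambda = \int_E \bar{h} h \, d|\lambda| = |\lambda|(E). \tag{*}$$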




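Therefore, whenever $0 < \mu(E) < \infty$,

$$\frac{1}{\mu(E)} \int_E \bar{h} g \, d\mu = \frac{|\lambda|(E)}{\mu(E)} \ge 0.$$

By Lemma 1.38, $\bar{h} g \ge 0$ $\mu$-a.e. Hence $\bar{h} g = |\bar{h} g| = |g|$ $\mu$-a.e., and therefore,
by (*),

$$|\lambda|(E) = \int_E |g| \, d\mu \qquad (E \in \mathcal{A}).$$

(The $\sigma$-finiteness hypothesis can be dropped: use Proposition 1.22 instead of
Lemma 1.38.)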

If $\mu$ is a real measure and $\mu = \mu^+ - \mu^-$ is its Jordan decomposition, the
following theorem expresses the positive measures $\mu^+$ and $\mu^-$ in terms of a
decomposition of the space $X$ as the disjoint union of two measurable sets $A$ and
$B$ that carry them (respectively). In particular, $\mu^+ \perp \mu^-$.


Theorem 1.48 (Hahn decomposition). Let $\mu$ be a real measure on the
measurable space $(X, \mathcal{A})$. Then there exist disjoint sets $A, B \in \mathcal{A}$ such that
$X = A \cup B$ and

$$\mu^+(E) = \mu(E \cap A), \qquad \mu^-(E) = -\mu(E \cap B)$$

for all $E \in \mathcal{A}$.


Proof. By Theorem 1.46, $d\mu = h\,d|\mu|$ with $h$ measurable and $|h| = 1$ on $X$. For
all $E \in \mathcal{A}$ with $|\mu|(E) > 0$,

$$\frac{1}{|\mu|(E)} \int_E h\,d|\mu| = \frac{\mu(E)}{|\mu|(E)} \in \mathbb{R}.$$

By the averages lemma, it follows that $h$ is real $|\mu|$-a.e., and since it is only
determined a.e., we may assume that $h$ is real everywhere on $X$. However, $|h| = 1$;
hence $h(X) = \{-1, 1\}$. Let $A := [h = 1]$ and $B := [h = -1]$. Then $X$ is the
disjoint union of these measurable sets, and for all $E \in \mathcal{A}$,

$$\mu^+(E) = \tfrac12\bigl(|\mu|(E) + \mu(E)\bigr) = \tfrac12 \int_E (1 + h)\,d|\mu| = \int_{E \cap A} h\,d|\mu| = \mu(E \cap A).$$

An analogous calculation shows that $\mu^-(E) = -\mu(E \cap B)$.
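
For a real measure on a finite set the Hahn decomposition can be written down directly: the carrying set $A$ collects the points of non-negative mass and $B$ the remaining points. The following is a minimal sketch under that assumption; the point-mass model and the names `hahn_decomposition`, `measure`, and `weights` are illustrative choices, not part of the text.

```python
from fractions import Fraction

def hahn_decomposition(weights):
    """Split a finite set carrying a real measure into carrying sets A and B.

    `weights` maps each point x to mu({x}); this point-mass model is an
    assumption made for illustration only.
    """
    A = {x for x, w in weights.items() if w >= 0}   # points where h = +1 (or mass 0)
    B = set(weights) - A                            # points where h = -1
    return A, B

def measure(weights, E):
    """mu(E) for a subset E of the finite set."""
    return sum((weights[x] for x in E), Fraction(0))

weights = {"a": Fraction(3), "b": Fraction(-2), "c": Fraction(1, 2), "d": Fraction(-1)}
A, B = hahn_decomposition(weights)

E = {"a", "b", "d"}
mu_plus = measure(weights, E & A)       # mu^+(E) = mu(E ∩ A)
mu_minus = -measure(weights, E & B)     # mu^-(E) = -mu(E ∩ B)
total_variation = mu_plus + mu_minus    # |mu|(E)
```

Here `mu_plus - mu_minus` recovers $\mu(E)$, in agreement with the Jordan decomposition.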


1.10 Convergence 


This section considers some modes of convergence and relations between them. 
In order to avoid repetitions, $(X, \mathcal{A}, \mu)$ will denote throughout a positive measure
space; $f, f_n$ $(n = 1, 2, \ldots)$ are complex measurable functions on $X$.




Definition 1.49. 


(1) $\{f_n\}$ converges to $f$ almost uniformly if for any $\epsilon > 0$, there exists $E \in \mathcal{A}$
with $\mu(E) < \epsilon$, such that $f_n \to f$ uniformly on $E^c$.


(2) $\{f_n\}$ is almost uniformly Cauchy if for any $\epsilon > 0$, there exists $E \in \mathcal{A}$ with
$\mu(E) < \epsilon$, such that $\{f_n\}$ is uniformly Cauchy on $E^c$.


Remark 1.50. 


(1) Taking $\epsilon = 1/k$ with $k = 1, 2, \ldots$, we see that if $\{f_n\}$ is almost uniformly
Cauchy, then there exist $E_k \in \mathcal{A}$ such that $\mu(E_k) < 1/k$ and $\{f_n\}$ is
uniformly Cauchy on $E_k^c$. Let $E = \bigcap_k E_k$; then $E \in \mathcal{A}$ and $\mu(E) < 1/k$ for
all $k$, so that $\mu(E) = 0$. If $x \in E^c = \bigcup_k E_k^c$, then $x \in E_k^c$ for some $k$, so
that $\{f_n(x)\}$ is Cauchy, and consequently $\lim_n f_n(x) =: f(x)$ exists. We may
define $f(x) = 0$ on $E$. The function $f$ is measurable, and $f_n \to f$ almost
everywhere (since $\mu(E) = 0$). For any $\epsilon, \delta > 0$, let $F \in \mathcal{A}$, $n_0 \in \mathbb{N}$ be such
that $\mu(F) < \epsilon$ and $|f_n(x) - f_m(x)| < \delta$ for all $x \in F^c$ and $n, m > n_0$.
Setting $G = F^c \cap E^c$, we have $\mu(G^c) < \epsilon$, and letting $m \to \infty$ in the
last inequality, we get $|f_n(x) - f(x)| \le \delta$ for all $x \in G$ and $n > n_0$. Thus
$f_n \to f$ uniformly on $G$, and consequently $f_n \to f$ almost uniformly. This
shows that almost uniformly Cauchy sequences converge almost uniformly;
the converse follows trivially from the triangle inequality.


(2) A trivial modification of the first argument here shows that if $f_n \to f$
almost uniformly, then $f_n \to f$ almost everywhere. In particular, the
almost uniform limit $f$ is uniquely determined up to equivalence.


Definition 1.51. 


(1) The sequence $\{f_n\}$ converges to $f$ in measure if for any $\epsilon > 0$,
$\lim_n \mu([|f_n - f| \ge \epsilon]) = 0$.
(2) The sequence $\{f_n\}$ is Cauchy in measure if for any $\epsilon > 0$,
$\lim_{n,m \to \infty} \mu([|f_n - f_m| \ge \epsilon]) = 0$.


Remark 1.52. 


(1) If $f_n \to f$ and $f_n \to f'$ in measure, then for any $\epsilon > 0$, the triangle
inequality shows that

$$\mu([|f - f'| \ge \epsilon]) \le \mu([|f - f_n| \ge \epsilon/2]) + \mu([|f_n - f'| \ge \epsilon/2]) \to 0$$

as $n \to \infty$, that is, $[|f - f'| \ge \epsilon]$ is $\mu$-null. Therefore, $[f \ne f'] = \bigcup_k [|f - f'| \ge 1/k]$ is $\mu$-null, and $f = f'$ a.e. This shows that limits in measure are
uniquely determined (up to equivalence).


(2) A similar argument based on the triangle inequality shows that if $\{f_n\}$
converges in measure, then it is Cauchy in measure.




(3) If $f_n \to f$ almost uniformly, then $f_n \to f$ in measure. Indeed, for any
$\epsilon, \delta > 0$, there exist $E \in \mathcal{A}$ and $n_0 \in \mathbb{N}$ such that $\mu(E) < \delta$ and
$|f_n - f| < \epsilon$ on $E^c$ for all $n \ge n_0$. Hence for all $n \ge n_0$, $[|f_n - f| \ge \epsilon] \subset E$,
and therefore $\mu([|f_n - f| \ge \epsilon]) < \delta$.


Theorem 1.53. The sequence $\{f_n\}$ converges in measure iff it is Cauchy in
measure.


Proof. By Remark 1.52.2, we need only show the "if" part of the theorem.
Let $\{f_n\}$ be Cauchy in measure. As in the proof of Lemma 1.30, we obtain
integers $1 \le n_1 < n_2 < n_3 < \cdots$ such that $\mu(E_k) < 1/2^k$, where

$$E_k = \Bigl[\,|f_{n_{k+1}} - f_{n_k}| \ge \frac{1}{2^k}\Bigr].$$

The set $F_m = \bigcup_{k \ge m} E_k$ has measure $< \sum_{k \ge m} 2^{-k} = 1/2^{m-1}$, and on $F_m^c$, we
have for $j > i \ge m$

$$|f_{n_j} - f_{n_i}| \le \sum_{k=i}^{j-1} |f_{n_{k+1}} - f_{n_k}| < \sum_{k=i}^{j-1} \frac{1}{2^k} < \frac{1}{2^{i-1}}.$$

This shows that $\{f_{n_k}\}$ is almost uniformly Cauchy. By Remark 1.50, $\{f_{n_k}\}$
converges almost uniformly to a measurable function $f$. Hence $f_{n_k} \to f$ in
measure, by Remark 1.52.3. For any $\epsilon > 0$, we have

$$[|f_n - f| \ge \epsilon] \subset [|f_n - f_{n_k}| \ge \epsilon/2] \cup [|f_{n_k} - f| \ge \epsilon/2]. \tag{1}$$

The measure of the first set on the right-hand side tends to zero when $n, k \to \infty$,
since $\{f_n\}$ is Cauchy in measure. The measure of the second set on the right-hand
side of (1) tends to zero when $k \to \infty$, since $f_{n_k} \to f$ in measure. Hence
the measure of the set on the left-hand side of (1) tends to zero as $n \to \infty$.


Theorem 1.54. If $f_n \to f$ in $L^p$ for some $p \in [1, \infty]$, then $f_n \to f$ in measure.


Proof. For any $\epsilon > 0$ and $n \in \mathbb{N}$, set $E_n = [|f_n - f| \ge \epsilon]$.
Case $p < \infty$. We have

$$\epsilon^p \mu(E_n) \le \int_{E_n} |f_n - f|^p\,d\mu \le \|f_n - f\|_p^p,$$

and consequently $f_n \to f$ in $L^p$ implies $\mu(E_n) \to 0$.

Case $p = \infty$. Let $A = \bigcup_n A_n$, where $A_n = [|f_n - f| > \|f_n - f\|_\infty]$. By
definition of the $L^\infty$-norm, each $A_n$ is null, and therefore $A$ is null, and $|f_n - f| \le \|f_n - f\|_\infty$ on $A^c$ for all $n$ (hence $f_n \to f$ uniformly on $A^c$ and $\mu(A) = 0$; in such
a situation, one says that $f_n \to f$ uniformly almost everywhere). If $f_n \to f$ in
$L^\infty$, there exists $n_0$ such that $\|f_n - f\|_\infty < \epsilon$ for all $n \ge n_0$. Thus $|f_n - f| < \epsilon$
on $A^c$ for all $n \ge n_0$, hence $E_n \subset A$ for all $n \ge n_0$, and consequently $E_n$ is null
for all $n \ge n_0$.
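
The inequality used in the case $p < \infty$ can be checked numerically on a finite measure space. The sketch below assumes a space with four atoms and a made-up difference $f_n - f$; the helper names `lp_norm_p` and `measure_of_large_set` are hypothetical and serve only to illustrate the estimate $\epsilon^p \mu(E_n) \le \|f_n - f\|_p^p$.

```python
def lp_norm_p(values, masses, p):
    """Integral of |v|^p, i.e. the p-th power of the L^p norm, for a discrete measure."""
    return sum(abs(v) ** p * m for v, m in zip(values, masses))

def measure_of_large_set(values, masses, eps):
    """mu([|v| >= eps]) for a discrete measure."""
    return sum(m for v, m in zip(values, masses) if abs(v) >= eps)

masses = [0.5, 0.25, 0.125, 0.125]      # mu({x_i}); hypothetical atoms

def diff(n):
    """Values of f_n - f on the four atoms (an assumed example tending to 0)."""
    return [1.0 / n, -2.0 / n, 0.0, 4.0 / n]

eps, p = 0.1, 2
checks = []
for n in (1, 10, 100):
    d = diff(n)
    lhs = eps ** p * measure_of_large_set(d, masses, eps)   # eps^p * mu(E_n)
    rhs = lp_norm_p(d, masses, p)                           # ||f_n - f||_p^p
    checks.append(lhs <= rhs)
```

Every entry of `checks` is `True`, and for large `n` the set $E_n$ is empty, as the theorem predicts.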




1.11 Convergence on finite measure space 


On finite measure spaces, there exist some additional relations between the 
various types of convergence. A sample of such relations is discussed in this 
section. 


Theorem 1.55. Let $(X, \mathcal{A}, \mu)$ be a finite measure space, and let $f_n : X \to \mathbb{C}$ be
measurable functions converging almost everywhere to the (measurable) function
$f$. Then $f_n \to f$ in measure.


Proof. Translating the definition of non-convergence into set theoretic
operations, we have (on any measure space!)

$$[f_n \text{ does not converge to } f] = \bigcup_{k \in \mathbb{N}} \limsup_n\, [|f_n - f| \ge 1/k].$$

This set is null (i.e., $f_n \to f$ a.e.) iff $\limsup_n [|f_n - f| \ge 1/k]$ is null for all $k$.
Since $\mu(X) < \infty$, this is equivalent to

$$\lim_n \mu\Bigl(\bigcup_{j \ge n} [|f_j - f| \ge 1/k]\Bigr) = 0$$

for all $k$, which clearly implies that $\lim_n \mu([|f_n - f| \ge 1/k]) = 0$ for all $k$ (i.e.,
$f_n \to f$ in measure).


Remark 1.56. 


(1) Conversely, if $f_n \to f$ in measure (in an arbitrary measure space),
then there exists a subsequence $\{f_{n_k}\}$ converging a.e. to $f$ (cf. proof of
Theorem 1.53).


(2) If the bounded sequence $\{f_n\}$ converges a.e. to $f$, then $f_n \to f$ in $L^p$
for any $1 \le p < \infty$ (by the proposition following Theorem 1.26; in
an arbitrary measure space, the boundedness condition on the sequence
must be replaced by its domination by a fixed $L^p$-function, not necessarily
constant).

(3) If $1 \le r < p \le \infty$, $L^p$-convergence implies $L^r$-convergence (by the second
proposition following Theorem 1.26).
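
The need to pass to a subsequence in Remark 1.56(1) is illustrated by the standard "typewriter" sequence on $[0, 1]$ with Lebesgue measure, which is not taken from the text: for $n = 2^k + j$ with $0 \le j < 2^k$, let $f_n$ be the indicator of $[j/2^k, (j+1)/2^k]$. Then $\mu([f_n \ge \epsilon]) = 2^{-k} \to 0$, so $f_n \to 0$ in measure, yet every point is hit infinitely often. A small sketch, with the hypothetical helper names `typewriter_interval` and `f`:

```python
def typewriter_interval(n):
    """Return (left, right) such that f_n is the indicator of [left, right], n >= 1."""
    k = n.bit_length() - 1        # n = 2**k + j with 0 <= j < 2**k
    j = n - (1 << k)
    return j / 2 ** k, (j + 1) / 2 ** k

def f(n, x):
    """Value of the n-th typewriter function at x."""
    left, right = typewriter_interval(n)
    return 1 if left <= x <= right else 0

x = 0.3
# In every dyadic block 2**k <= n < 2**(k+1) some f_n equals 1 at x,
# so f_n(x) does not converge to 0 ...
hits_in_each_block = [any(f(n, x) == 1 for n in range(2 ** k, 2 ** (k + 1)))
                      for k in range(8)]
# ... but along the subsequence n_k = 2**k the values at x are eventually 0.
subsequence_values = [f(2 ** k, x) for k in range(2, 10)]
```

The subsequence $f_{2^k}$ is the indicator of $[0, 2^{-k}]$ and converges to $0$ at every $x > 0$, hence a.e.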


Theorem 1.57 (Egoroff). Let $(X, \mathcal{A}, \mu)$ be a finite positive measure space, and
let $\{f_n\}$ be measurable functions converging pointwise a.e. to the function $f$.
Then $f_n \to f$ almost uniformly.


Proof. We first prove the following. 


Lemma (Assumptions and notation as in theorem). Given $\epsilon, \delta > 0$, there
exist $A \in \mathcal{A}$ with $\mu(A) < \delta$ and $N \in \mathbb{N}$ such that $|f_n - f| < \epsilon$ on $A^c$ for all
$n \ge N$.




Proof of lemma. Denote $E_n := [|f_n - f| \ge \epsilon]$ and $A_N := \bigcup_{n \ge N} E_n$. Then
$\{A_N\}$ is a decreasing sequence of measurable sets, and since $\mu$ is a finite
measure, $\mu(\bigcap_N A_N) = \lim_N \mu(A_N)$. Clearly $f_n(x)$ does not converge to $f(x)$
when $x \in \bigcap_N A_N$, and since $f_n \to f$ a.e., it follows that $\mu(\bigcap_N A_N) = 0$, that is,
$\lim_N \mu(A_N) = 0$. Fix then $N$ such that $\mu(A_N) < \delta$ and choose $A := A_N$. Since
$A^c = \bigcap_{n \ge N} [|f_n - f| < \epsilon]$, the set $A$ satisfies the lemma's requirements.


Proof of theorem. Given $\epsilon, \delta > 0$, apply the lemma with $\epsilon_m = 1/m$ and $\delta_m = \delta/2^m$, $m = 1, 2, \ldots$. We get measurable sets $A_m$ with $\mu(A_m) < \delta_m$ and integers
$N_m$, such that $|f_n - f| < 1/m$ on $A_m^c$ for all $n \ge N_m$ $(m = 1, 2, \ldots)$. Let
$A := \bigcup_m A_m$; then $\mu(A) < \delta$, and on $A^c$ $(= \bigcap_m A_m^c)$, we have $|f_n - f| < 1/m$
for all $n \ge N_m$, $m = 1, 2, \ldots$. Fix an integer $m_0 > 1/\epsilon$, and let $N := N_{m_0}$; then
$|f_n - f| < \epsilon$ on $A^c$ for all $n \ge N$.
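
A familiar illustration, chosen here as an assumption and not taken from the text, is $f_n(x) = x^n$ on $[0, 1]$ with Lebesgue measure. The sequence converges to $0$ on $[0, 1)$ but not uniformly there, while on $A^c = [0, 1 - \delta]$ (so that $\mu(A) = \delta$) we have $\sup |f_n| = (1 - \delta)^n \to 0$. The sketch below finds the index $N$ of the lemma for given $\epsilon$ and $\delta$; the function name `egoroff_index` is hypothetical.

```python
def egoroff_index(eps, delta):
    """Smallest N with sup over [0, 1 - delta] of x**n below eps for all n >= N.

    Assumes eps > 0 and 0 < delta <= 1. Since (1 - delta)**n decreases in n,
    the first n passing the test works for every later n as well.
    """
    r = 1.0 - delta          # the supremum of x**n on [0, 1 - delta] is r**n
    n = 1
    while r ** n >= eps:
        n += 1
    return n

N = egoroff_index(0.01, 0.1)
```

Shrinking `delta` makes the exceptional set smaller at the price of a larger `N`, which is the trade-off expressed by almost uniform convergence.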


1.12 Distribution function 


Definition 1.58. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let $f : X \to [0, \infty]$ be measurable. The distribution function of $f$ is defined by

$$m(y) = \mu([f > y]) \qquad (y > 0). \tag{1}$$

This is a non-negative non-increasing function on $\mathbb{R}^+$, so that $m(\infty) := \lim_{y \to \infty} m(y)$ exists and is $\ge 0$. We shall assume in the sequel that $m$ is
finite-valued and $m(\infty) = 0$. The finiteness of $m$ implies that

$$m(a) - m(b) = \mu([a < f \le b]) \qquad (0 < a < b < \infty).$$

Let $\{y_n\}$ be any positive sequence increasing to $\infty$. If $E_n := [f > y_n]$, then
$E_{n+1} \subset E_n$ and $\bigcap E_n = [f = \infty]$. Since $m$ is finite-valued, we have by
Lemma 1.11

$$m(\infty) = \lim m(y_n) = \lim \mu(E_n) = \mu\Bigl(\bigcap E_n\Bigr) = \mu([f = \infty]).$$

Thus our second assumption here means that $f$ is finite $\mu$-a.e.


Both assumptions here are satisfied in particular when $\int_X f^p\,d\mu < \infty$ for
some $p \in [1, \infty)$. This follows from the inequality

$$m(y) \le \Bigl(\frac{\|f\|_p}{y}\Bigr)^p \qquad (y > 0) \tag{2}$$

(cf. proof of Theorem 1.54).


Theorem 1.59. Suppose the distribution function $m$ of the non-negative
measurable function $f$ is finite and vanishes at infinity. Then:


(1) For all $p \in [1, \infty)$,

$$\int_X f^p\,d\mu = -\int_0^\infty y^p\,dm(y), \tag{3}$$

where the integral on the right-hand side is the improper Riemann–Stieltjes
integral

$$\lim_{a \to 0,\ b \to \infty} \int_a^b y^p\,dm(y).$$

(2) If either one of the integrals $-\int_0^\infty y^p\,dm(y)$ and $p \int_0^\infty y^{p-1} m(y)\,dy$ is finite,
then

$$\lim_{y \to 0} y^p m(y) = \lim_{y \to \infty} y^p m(y) = 0, \tag{4}$$

and the integrals coincide.


Proof. Let $0 < a < b < \infty$ and $n \in \mathbb{N}$. Denote

$$y_j = a + j\,\frac{b - a}{n 2^n}, \qquad j = 0, \ldots, n 2^n;$$

$$E_j = [y_{j-1} < f \le y_j]; \qquad E_{a,b} = [a < f \le b];$$

$$s_n = \sum_{j=1}^{n 2^n} y_{j-1} I_{E_j}.$$

The sequence $\{s_n^p\}$ is a non-decreasing sequence of non-negative measurable
functions with limit $f^p$ (cf. proof of Theorem 1.8). By the Monotone Convergence
theorem,

$$\int_{E_{a,b}} f^p\,d\mu = \lim_n \int_{E_{a,b}} s_n^p\,d\mu = \lim_n \sum_{j=1}^{n 2^n} y_{j-1}^p\,\mu(E_j) = -\lim_n \sum_{j=1}^{n 2^n} y_{j-1}^p\,[m(y_j) - m(y_{j-1})] = -\int_a^b y^p\,dm(y).$$

The first integral above converges to the (finite or infinite) limit $\int_X f^p\,d\mu$ when
$a \to 0$ and $b \to \infty$. It follows that the last integral converges to the same limit,
that is, the improper Riemann–Stieltjes integral $\int_0^\infty y^p\,dm(y)$ exists and (3) is
valid.

Since $-dm$ is a positive measure, we have

$$0 \le a^p [m(a) - m(b)] = -\int_a^b a^p\,dm(y) \le -\int_a^b y^p\,dm(y), \tag{5}$$


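hence

$$a^p m(a) \le a^p m(b) - \int_a^b y^p\,dm(y), \tag{6}$$

and letting $b \to \infty$ (recall that $m(\infty) = 0$),

$$0 \le a^p m(a) \le -\int_a^\infty y^p\,dm(y). \tag{7}$$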


In case $\int_0^\infty y^p\,dm(y)$ is finite, letting $a \to \infty$ in (7) shows that
$\lim_{a \to \infty} a^p m(a) = 0$. Also letting $a \to 0$ in (6) (with $b$ fixed arbitrary) shows
that

$$0 \le \limsup_{a \to 0} a^p m(a) \le -\int_0^b y^p\,dm(y).$$

Letting $b \to 0$, we conclude that $\lim_{a \to 0} a^p m(a)$ exists and equals $0$, and (4) is verified.
An integration by parts gives

$$-\int_a^b y^p\,dm(y) = a^p m(a) - b^p m(b) + p \int_a^b y^{p-1} m(y)\,dy. \tag{8}$$

Letting $a \to 0$ and $b \to \infty$, we obtain from (4) (in case $\int_0^\infty y^p\,dm(y)$ is finite)

$$-\int_0^\infty y^p\,dm(y) = p \int_0^\infty y^{p-1} m(y)\,dy. \tag{9}$$

Consider finally the case when $\int_0^\infty y^{p-1} m(y)\,dy < \infty$. We have

$$(1 - 2^{-p})\,b^p m(b) = [b^p - (b/2)^p]\,m(b) = m(b) \int_{b/2}^b p\,y^{p-1}\,dy \le p \int_{b/2}^b y^{p-1} m(y)\,dy \to 0$$

as $b \to \infty$ or $b \to 0$ (by Cauchy's criterion). Thus (4) is verified, and (9) was
seen to follow from (4).


Corollary 1.60. Let $f \in L^p(\mu)$ for some $p \in [1, \infty)$, and let $m$ be the distribution
function of $|f|$. Then

$$\|f\|_p^p = -\int_0^\infty y^p\,dm(y) = p \int_0^\infty y^{p-1} m(y)\,dy.$$
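
For a function with finitely many values the second formula of the corollary can be verified exactly, since $m$ is then a step function. The sketch below assumes a finite measure space given by lists of values and masses and an integer exponent $p$; the names `distribution`, `moment_direct`, and `moment_layer_cake` are illustrative only.

```python
from fractions import Fraction

def distribution(values, masses, y):
    """m(y) = mu([f > y]) for a non-negative function on a finite measure space."""
    return sum((m for v, m in zip(values, masses) if v > y), Fraction(0))

def moment_direct(values, masses, p):
    """Integral of f^p with respect to mu."""
    return sum((Fraction(v) ** p * m for v, m in zip(values, masses)), Fraction(0))

def moment_layer_cake(values, masses, p):
    """p * (integral over (0, oo) of y^(p-1) m(y) dy), computed exactly.

    m is constant on each interval [lo, hi) between consecutive values of f,
    and p times the integral of y^(p-1) over that interval is hi^p - lo^p.
    """
    levels = sorted({Fraction(0), *map(Fraction, values)})
    total = Fraction(0)
    for lo, hi in zip(levels, levels[1:]):
        total += (hi ** p - lo ** p) * distribution(values, masses, lo)
    return total

values = [Fraction(1, 2), Fraction(2), Fraction(2), Fraction(5)]   # f(x_i), assumed
masses = [Fraction(1), Fraction(1, 3), Fraction(2), Fraction(1, 4)]  # mu({x_i}), assumed
direct = moment_direct(values, masses, 3)
layered = moment_layer_cake(values, masses, 3)
```

Both computations return the same rational number, as the corollary asserts.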


1.13 Truncation 


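Technique 1.61. The technique of truncation of functions is useful in real
methods of analysis. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let
$f : X \to \mathbb{C}$. For each $u > 0$, we define the truncation at $u$ of $f$ by

$$f_u := f\,I_{[|f| \le u]} + u\,\frac{f}{|f|}\,I_{[|f| > u]}. \tag{1}$$

Denote

$$f'_u := f - f_u = \Bigl(f - u\,\frac{f}{|f|}\Bigr) I_{[|f| > u]} = (|f| - u)\,\frac{f}{|f|}\,I_{[|f| > u]}. \tag{2}$$

We have

$$|f_u| = |f|\,I_{[|f| \le u]} + u\,I_{[|f| > u]} = \min(|f|, u), \tag{3}$$

$$|f'_u| = (|f| - u)\,I_{[|f| > u]}, \tag{4}$$

$$f = f_u + f'_u, \qquad |f| = |f_u| + |f'_u|. \tag{5}$$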




It follows in particular that $f \in L^p(\mu)$ for some $p \in [1, \infty]$ iff both $f_u$ and $f'_u$ are
in $L^p(\mu)$. In this case $f_u \in L^r(\mu)$ for any $r \ge p$, because $|f_u/u| \le 1$, so that

$$u^{-r} |f_u|^r = |f_u/u|^r \le |f_u/u|^p \le u^{-p} |f|^p.$$

Similarly (still when $f \in L^p$), $f'_u \in L^r(\mu)$ for any $r \le p$. Indeed, write

$$\int_X |f'_u|^r\,d\mu = \int_{[|f'_u| > 1]} |f'_u|^r\,d\mu + \int_{[0 < |f'_u| \le 1]} |f'_u|^r\,d\mu.$$

For $r \le p$, the first integral on the right-hand side is

$$\le \int_{[|f'_u| > 1]} |f'_u|^p\,d\mu \le \|f'_u\|_p^p;$$

the second integral on the right-hand side is

$$\le \mu([0 < |f'_u| \le 1]) = \mu([0 < |f| - u \le 1]) = \mu([u < |f| \le u + 1]) = m(u) - m(u + 1).$$

Thus, for any $r \le p$,

$$\|f'_u\|_r^r \le \|f'_u\|_p^p + m(u) - m(u + 1) < \infty,$$

as claimed.


Since $|f_u| = \min(|f|, u)$, we have $[|f_u| > y] = [|f| > y]$ whenever $0 < y < u$
and $[|f_u| > y] = \emptyset$ whenever $y \ge u$. Therefore, if $m_u$ and $m$ are the distribution
functions of $|f_u|$ and $|f|$, respectively, we have

$$m_u(y) = m(y) \ \text{ for } 0 < y < u; \qquad m_u(y) = 0 \ \text{ for } y \ge u. \tag{6}$$

For the distribution function $m'_u$ of $|f'_u|$, we have the relation

$$m'_u(y) = m(y + u) \qquad (y > 0), \tag{7}$$

since by (4)

$$m'_u(y) = \mu([|f'_u| > y]) = \mu([|f| - u > y]) = \mu([|f| > y + u]) = m(y + u).$$

By (6), (7), and Corollary 1.60, the following formulae are valid for any $f \in L^p(\mu)$
$(1 \le p < \infty)$ and $u > 0$:

$$\|f_u\|_r^r = r \int_0^u v^{r-1} m(v)\,dv \qquad (r \ge p); \tag{8}$$

$$\|f'_u\|_r^r = r \int_u^\infty (v - u)^{r-1} m(v)\,dv \qquad (r \le p). \tag{9}$$


These formulae are used in Section 5.40. 
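
The pointwise definitions (1) and (2) translate directly into code. The following sketch applies them to a single complex value $z = f(x)$; the function name `truncate` is a hypothetical choice, and the sample value is arbitrary.

```python
def truncate(z, u):
    """Return (f_u, f'_u) at a point where f takes the complex value z, for u > 0."""
    r = abs(z)
    if r <= u:
        return z, 0j              # on [|f| <= u]: f_u = f and f'_u = 0
    low = u * z / r               # u * f / |f|: same argument as z, modulus u
    return low, z - low           # f'_u = f - f_u

low, high = truncate(3 + 4j, 2.0)   # |z| = 5 exceeds u = 2
```

Up to rounding, `abs(low)` equals $\min(|z|, u) = 2$ and `abs(high)` equals $|z| - u = 3$, in agreement with (3), (4), and (5).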




Exercises 


1. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let $f$ be a non-negative
measurable function on $X$. Let $E := [f < 1]$. Prove

(a) $\mu(E) = \lim_n \int_E \exp(-f^n)\,d\mu$.
(b) $\sum_{n=1}^\infty \int_E f^n\,d\mu = \int_E \bigl(f/(1 - f)\bigr)\,d\mu$.


2. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let $p, q$ be conjugate
exponents. Prove that the map $[f, g] \in L^p(\mu) \times L^q(\mu) \to fg \in L^1(\mu)$
is continuous.


3. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, $f_n : X \to \mathbb{C}$ measurable
functions converging pointwise to $f$, and $h : \mathbb{C} \to \mathbb{C}$ continuous and
bounded. Prove that $\lim_n \int_E h(f_n)\,d\mu = \int_E h(f)\,d\mu$ for each $E \in \mathcal{A}$ with
finite measure.


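4. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let $\mathcal{B} \subset \mathcal{A}$ be a $\sigma$-finite
$\sigma$-algebra. If $f \in L^1(\mathcal{A}) := L^1(X, \mathcal{A}, \mu)$, consider the complex measure on
$\mathcal{B}$ defined by

$$\lambda_f(E) = \int_E f\,d\mu \qquad (E \in \mathcal{B}).$$

Prove:

(a) There exists a unique element $Pf \in L^1(\mathcal{B}) := L^1(X, \mathcal{B}, \mu)$ such that
$\lambda_f(E) = \int_E (Pf)\,d\mu$ $(E \in \mathcal{B})$.

(b) The map $P : f \to Pf$ is a continuous linear map of $L^1(\mathcal{A})$ onto the
subspace $L^1(\mathcal{B})$, such that $P^2 = P$ ($P^2$ denotes the composition of $P$
with itself). In particular, $L^1(\mathcal{B})$ is a closed subspace of $L^1(\mathcal{A})$.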


5. Let $(X, \mathcal{A}, \mu)$ be a finite positive measure space and $f_n \in L^p(\mu)$ for all
$n \in \mathbb{N}$ (for some $p \in [1, \infty)$). Suppose there exists a measurable function
$f : X \to \mathbb{C}$ such that $\sup_n \sup_X |f_n - f| < \infty$ and $f_n \to f$ in measure.
Prove that $f \in L^p(\mu)$ and $f_n \to f$ in $L^p$-norm.


6. Let $\lambda$ and $\mu$ be positive $\sigma$-finite measures on the measurable space $(X, \mathcal{A})$.
State and prove a version of the Lebesgue–Radon–Nikodym theorem for
this situation.


7. Let $\{\lambda_n\}$ be a sequence of complex measures on the measurable space
$(X, \mathcal{A})$ such that $\sum_n \|\lambda_n\| < \infty$. Prove

(a) For each $E \in \mathcal{A}$, the series $\sum_n \lambda_n(E)$ converges absolutely in $\mathbb{C}$ and
defines a complex measure $\lambda$; the series $\sum_n |\lambda_n|(E)$ converges in $\mathbb{R}^+$,
and defines a finite positive measure $\sigma$, and $\lambda \ll \sigma$.


(b) $\dfrac{d\lambda}{d\sigma} = \sum_n \dfrac{d\lambda_n}{d\sigma}$.


8. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let $M := M(\mathcal{A})$ denote the
vector space (over $\mathbb{C}$) of all complex measures on $\mathcal{A}$. Set

$$M_a := \{\lambda \in M;\ \lambda \ll \mu\}; \qquad M_s := \{\lambda \in M;\ \lambda \perp \mu\}.$$

Prove:

(a) If $\lambda \in M$ is supported by $E \in \mathcal{A}$, then so is $|\lambda|$.

(b) $M_a$ and $M_s$ are subspaces of $M$ and $M_a \perp M_s$ (in particular, $M_a \cap M_s = \{0\}$).

(c) If $(X, \mathcal{A}, \mu)$ is $\sigma$-finite, then $M = M_a \oplus M_s$.

(d) $\lambda \in M_a$ iff $|\lambda| \in M_a$ (and similarly for $M_s$).

(e) If $\lambda_k \in M$ $(k = 1, 2)$, then $\lambda_1 \perp \lambda_2$ iff $|\lambda_1| \perp |\lambda_2|$.

(f) $\lambda \ll \mu$ iff for each $\epsilon > 0$, there exists $\delta > 0$ such that $|\lambda(E)| < \epsilon$ for
all $E \in \mathcal{A}$ with $\mu(E) < \delta$.

(Hint: if the $\epsilon, \delta$ condition fails, there exist $E_n \in \mathcal{A}$ with $\mu(E_n) < 1/2^n$
such that $|\lambda(E_n)| \ge \epsilon$ (hence $|\lambda|(E_n) \ge \epsilon$), for some $\epsilon > 0$; consider the
set $E = \limsup E_n$.)


9. Let $(X, \mathcal{A}, \mu)$ be a probability space (i.e., a positive measure space such
that $\mu(X) = 1$). Let $f, g$ be (complex) measurable functions. Prove that
$\|f\|_1 \|g\|_1 \ge \inf_X |fg|$.


10. Let $(X, \mathcal{A}, \mu)$ be a positive measure space and $f$ a complex measurable
function on $X$.

(a) If $\mu(X) < \infty$, prove that

$$\lim_{p \to \infty} \|f\|_p = \|f\|_\infty. \tag{*}$$

(The cases $\|f\|_\infty = 0$ or $\infty$ are trivial; we may then assume that
$\|f\|_\infty = 1$; given $\epsilon$, there exists $E \in \mathcal{A}$ such that $\mu(E) > 0$ and
$(1 - \epsilon)\mu(E)^{1/p} \le \|f\|_p \le \mu(X)^{1/p}$.)

(b) For an arbitrary positive measure space, if $\|f\|_r < \infty$ for some $r \in [1, \infty)$, then (*) is valid.

(Consider the finite positive measure $\nu(E) = \int_E |f|^r\,d\mu$. We may
assume as in part (a) that $\|f\|_\infty = 1$. Verify that $\|f\|_{L^\infty(\nu)} = 1$ and
$\|f\|_{L^p(\mu)} = \|f\|_{L^{p-r}(\nu)}^{1 - r/p}$ for all $p \ge r + 1$.)


11. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, $1 \le p < \infty$, and $\epsilon > 0$.

(a) Suppose $f_n, f$ are unit vectors in $L^p(\mu)$ such that $f_n \to f$ a.e. Consider
the probability measure $d\nu = |f|^p\,d\mu$. Show that there exists $E \in \mathcal{A}$
such that $f_n/f \to 1$ uniformly on $E$ and $\nu(E^c) < \epsilon$. (Hint: Egoroff's
theorem.)

(b) For $E$ as in part (a), show that $\limsup_n \int_{E^c} |f_n|^p\,d\mu < \epsilon$.

(c) Deduce from parts (a) and (b) that $f_n \to f$ in $L^p(\mu)$-norm.

(d) If $g_n, g \in L^p(\mu)$ are such that $g_n \to g$ a.e. and $\|g_n\|_p \to \|g\|_p$, then
$g_n \to g$ in $L^p(\mu)$-norm. (Consider $f_n = g_n/\|g_n\|_p$ and $f = g/\|g\|_p$.)


12. Let $(X, \mathcal{A}, \mu)$ be a positive measure space and $f \in L^p(\mu)$ for some $p \in [1, \infty)$. Prove that the set $[f \ne 0]$ has $\sigma$-finite measure.


13. Let $(X, \mathcal{A})$ be a measurable space, and let $f_n : X \to \mathbb{C}$, $n \in \mathbb{N}$, be
measurable functions. Prove that the set of all points $x \in X$ for which the
complex sequence $\{f_n(x)\}$ converges in $\mathbb{C}$ is measurable.


14. Let $(X, \mathcal{A})$ be a measurable space, and let $E$ be a dense subset of $\mathbb{R}$.
Suppose $f : X \to \mathbb{R}$ is such that $[f > c] \in \mathcal{A}$ for all $c \in E$. Prove that $f$ is
measurable.


15. Let $(X, \mathcal{A})$ be a measurable space, and let $f : X \to \mathbb{R}^+$ be measurable.
Prove that there exist $c_k > 0$ and $E_k \in \mathcal{A}$ $(k \in \mathbb{N})$ such that $f = \sum_{k=1}^\infty c_k I_{E_k}$. Conclude that for any positive measure $\mu$ on $\mathcal{A}$, $\int f\,d\mu = \sum_{k=1}^\infty c_k \mu(E_k)$; in particular, if $f \in L^1(\mu)$, the series converges (in the
strict sense) and $\mu(E_k) < \infty$ for all $k$. (Hint: get $s_n$ as in the approximation
theorem, and observe that $f = \sum_n (s_n - s_{n-1})$.)


16. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, and let $\{E_k\} \subset \mathcal{A}$ be such that
$\sum_k \mu(E_k) < \infty$. Prove that almost all $x \in X$ lie in at most finitely many
of the sets $E_k$. (Hint: the set of all $x$'s that lie in infinitely many $E_k$'s is
$\limsup E_k$.)


17. Let $X$ be a (complex) normed space. Define

$$f(x, y) = \frac{\|x + y\|^2 + \|x - y\|^2}{2\|x\|^2 + 2\|y\|^2} \qquad (x, y \in X).$$

(We agree that the fraction is 1 when $x = y = 0$.) Prove:

(a) $1/2 \le f \le 2$.
(b) $X$ is an inner product space iff $f = 1$ (identically).

2 


Construction of measures 


This chapter introduces Constantin Carathéodory's powerful technique for
constructing positive measures from primitive objects called semi-measures on
semi-algebras. In contrast to $\sigma$-algebras, which are normally "big" and intangible,
semi-algebras are often "small" and concrete. The first four sections cover
the development of Carathéodory's method, including a structure theorem
characterizing measurability of sets. Next, we use this method to construct the
Lebesgue–Stieltjes measures, a special case of which is the Lebesgue measure,
which is arguably the most important example in basic mathematics of a non-trivial
measure. This measure is the "natural" measure on $\mathbb{R}$ in the sense that it
maps an interval to its length. Intuitively, the Lebesgue measure $\mu$ is translation
invariant: $\mu(t + E) = \mu(E)$ for measurable $E \subset \mathbb{R}$ and $t \in \mathbb{R}$. This makes it the
Haar measure of $\mathbb{R}$; see Chapter 4.

We prove that every Riemann integrable complex function on an interval 
[a,b] is also Lebesgue integrable there with respect to the Lebesgue measure 
(the converse being evidently false), a result mentioned in Chapter 1. 

The final section applies Caratheodory’s extension theorem to construct the 
product of two positive measure spaces. The two most fundamental results are 
Fubini’s and Tonelli’s theorems, roughly saying that under suitable conditions, 
the “double integral” equals both “iterated” integrals. 


2.1 Semi-algebras 


The purpose of this chapter is to construct measure spaces from more primitive 
objects. We start with a semi-algebra $\mathcal{C}$ of subsets of a given set $X$ and a
semi-measure $\mu$ defined on it.


Definition 2.1. Let $X$ be a (non-empty) set. A semi-algebra of subsets of $X$
(briefly, a semi-algebra on $X$) is a subfamily $\mathcal{C}$ of $\mathcal{P}(X)$ with the following
properties:


(1) if $A, B \in \mathcal{C}$, then $A \cap B \in \mathcal{C}$; $\emptyset \in \mathcal{C}$;


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press. 
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/os0/9780192849540.003.0002 


(2) if $A \in \mathcal{C}$, then $A^c$ is the union of finitely many mutually disjoint sets in $\mathcal{C}$.


Any algebra is a semi-algebra, but not conversely. For example, the family

$$\mathcal{C} = \{(a, b];\ a, b \in \mathbb{R}\} \cup \{(-\infty, b];\ b \in \mathbb{R}\} \cup \{(a, \infty);\ a \in \mathbb{R}\} \cup \{\emptyset\}$$

is a semi-algebra on $\mathbb{R}$, but is not an algebra. Similar semi-algebras of half-closed
cells arise naturally in the Euclidean space $\mathbb{R}^k$.
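
The two defining properties can be checked mechanically for the half-closed intervals above. The sketch below encodes a cell $(a, b]$ as a pair `(a, b)`, with `b = INF` standing for $(a, \infty)$ and `None` for the empty set; this encoding and the helper names `normalize`, `intersect`, and `complement` are assumptions made for illustration.

```python
import math

INF = math.inf

def normalize(a, b):
    """Encode (a, b] as a pair (b = INF means (a, oo)); empty intervals become None."""
    return (a, b) if a < b else None

def intersect(I, J):
    """Intersection of two cells; the result is again a cell (property (1))."""
    if I is None or J is None:
        return None
    return normalize(max(I[0], J[0]), min(I[1], J[1]))

def complement(I):
    """Finitely many disjoint cells whose union is the complement of I (property (2))."""
    if I is None:                        # the complement of the empty set is R
        return [(-INF, 0.0), (0.0, INF)]
    a, b = I
    pieces = [normalize(-INF, a), normalize(b, INF)]
    return [P for P in pieces if P is not None]

meet = intersect((0.0, 2.0), (1.0, 3.0))     # (0, 2] ∩ (1, 3] = (1, 2]
outside = complement((0.0, 1.0))             # (-oo, 0] ∪ (1, oo)
```

The complement of a bounded cell consists of two cells, so the family is closed under complements only up to finite disjoint unions; this is why it is a semi-algebra and not an algebra.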


Definition 2.2. Let $\mathcal{C}$ be a semi-algebra on $X$. A semi-measure on $\mathcal{C}$ is a function

$$\mu : \mathcal{C} \to [0, \infty]$$

with the following properties:


(1) $\mu(\emptyset) = 0$;

(2) if $E_i \in \mathcal{C}$, $i = 1, \ldots, n$ are mutually disjoint with union $E \in \mathcal{C}$, then
$\mu(E) = \sum_{i=1}^n \mu(E_i)$;

(3) if $E_i \in \mathcal{C}$, $i = 1, 2, \ldots$ are mutually disjoint with union $E \in \mathcal{C}$, then
$\mu(E) \le \sum_{i=1}^\infty \mu(E_i)$.


If $\mathcal{C}$ is a $\sigma$-algebra, any measure on $\mathcal{C}$ is a semi-measure. A simple "natural"
example of a semi-measure on the semi-algebra of half-closed intervals on $\mathbb{R}$
mentioned above is given by

$$\mu((a, b]) = b - a, \quad a, b \in \mathbb{R},\ a < b;$$
$$\mu(\emptyset) = 0; \qquad \mu((-\infty, b]) = \mu((a, \infty)) = \infty.$$


Let $\mathcal{C}$ be a semi-algebra on $X$, and let $\mathcal{A}$ be the family of all finite unions
of mutually disjoint sets from $\mathcal{C}$. Then $\emptyset \in \mathcal{A}$; if $A = \bigcup E_i$, $B = \bigcup F_j \in \mathcal{A}$,
with $E_i \in \mathcal{C}$, $i = 1, \ldots, m$ disjoint and $F_j \in \mathcal{C}$, $j = 1, \ldots, n$ disjoint, then
$A \cap B = \bigcup_{i,j} E_i \cap F_j \in \mathcal{A}$ as a finite union of disjoint sets from $\mathcal{C}$ by Condition (1)
in Definition 2.1. Also $A^c = \bigcap E_i^c \in \mathcal{A}$, since $E_i^c \in \mathcal{A}$ by Condition (2) in
Definition 2.1, and we just saw that $\mathcal{A}$ is closed under finite intersections. We
conclude that $\mathcal{A}$ is an algebra on $X$ that includes $\mathcal{C}$, and it is obviously contained
in any algebra on $X$ that contains $\mathcal{C}$. Thus, $\mathcal{A}$ is the algebra generated by the
semi-algebra $\mathcal{C}$ (i.e., the algebra, minimal under inclusion, that contains $\mathcal{C}$).


Definition 2.3. Let $\mathcal{A}$ be any algebra on the set $X$. A measure on the algebra
$\mathcal{A}$ is a function $\mu : \mathcal{A} \to [0, \infty]$ such that $\mu(\emptyset) = 0$, and if $E \in \mathcal{A}$ is the countable
union of mutually disjoint sets $E_i \in \mathcal{A}$, then $\mu(E) = \sum \mu(E_i)$ (i.e., $\mu$ is countably
additive whenever this makes sense).


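Theorem 2.4. Let $\mathcal{C}$ be a semi-algebra on the set $X$, let $\mu$ be a semi-measure
on $\mathcal{C}$, let $\mathcal{A}$ be the algebra generated by $\mathcal{C}$, and extend $\mu$ to $\mathcal{A}$ by letting

$$\mu\Bigl(\bigcup_{i=1}^n E_i\Bigr) = \sum_{i=1}^n \mu(E_i) \tag{1}$$

for $E_i \in \mathcal{C}$, $i = 1, \ldots, n$, mutually disjoint. Then $\mu$ is a measure on the algebra
$\mathcal{A}$.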


Proof. First, $\mu$ is well-defined by (1) on $\mathcal{A}$. Indeed, if $E_i, F_j \in \mathcal{C}$ are such that
$A = \bigcup E_i = \bigcup F_j$ (finite disjoint unions), then each $F_j \in \mathcal{C}$ is the finite disjoint
union of the sets $E_i \cap F_j \in \mathcal{C}$; by Condition (2) in Definition 2.2,

$$\mu(F_j) = \sum_i \mu(E_i \cap F_j),$$

and therefore

$$\sum_j \mu(F_j) = \sum_{i,j} \mu(E_i \cap F_j).$$

The symmetry of the right-hand side in $E_i$ and $F_j$ implies that it is also equal
to $\sum_i \mu(E_i)$, and the definition (1) is indeed independent of the representation
of $A \in \mathcal{A}$ as a finite disjoint union of sets in $\mathcal{C}$.

It is now clear that $\mu$ is finitely additive on $\mathcal{A}$, hence monotonic.

Let $E \in \mathcal{A}$ be the disjoint union of $E_i \in \mathcal{A}$, $i = 1, 2, \ldots$. For each $n \in \mathbb{N}$,
$\bigcup_{i=1}^n E_i \subset E$, hence

$$\sum_{i=1}^n \mu(E_i) = \mu\Bigl(\bigcup_{i=1}^n E_i\Bigr) \le \mu(E),$$

and therefore

$$\sum_{i=1}^\infty \mu(E_i) \le \mu(E). \tag{2}$$

Next, write $E_i = \bigcup_j F_{ij}$, a finite disjoint union of $F_{ij} \in \mathcal{C}$, for each fixed $i$, and
similarly, since $E \in \mathcal{A}$, $E = \bigcup_k G_k$, a finite disjoint union of sets $G_k \in \mathcal{C}$. Then
$G_k \in \mathcal{C}$ is the (countable) disjoint union of the sets $F_{ij} \cap G_k \in \mathcal{C}$ (by Condition (1)
in Definition 2.1), over all $i, j$. By Condition (3) in Definition 2.2, it follows that
for all $k$,

$$\mu(G_k) \le \sum_{i,j} \mu(F_{ij} \cap G_k).$$

Hence

$$\mu(E) = \sum_k \mu(G_k) \le \sum_{i,j,k} \mu(F_{ij} \cap G_k) = \sum_i \sum_{j,k} \mu(F_{ij} \cap G_k) = \sum_i \mu(E_i),$$

by the definition (1) of $\mu$ on $\mathcal{A}$, because $E_i = \bigcup_{j,k} F_{ij} \cap G_k$, a finite disjoint
union of sets in $\mathcal{C}$. Together with (2), this proves that $\mu$ is indeed a measure on
the algebra $\mathcal{A}$.




2.2 Outer measures 


A measure on an algebra can be extended to an outer measure on $\mathcal{P}(X)$.

Definition 2.5. An outer measure on the set $X$ (in fact, on $\mathcal{P}(X)$) is a function

$$\mu^* : \mathcal{P}(X) \to [0, \infty]$$

with the following properties:

(1) $\mu^*(\emptyset) = 0$;
(2) $\mu^*$ is monotonic (i.e., $\mu^*(E) \le \mu^*(F)$ whenever $E \subset F \subset X$);
(3) $\mu^*$ is countably subadditive, that is,

$$\mu^*\Bigl(\bigcup_i E_i\Bigr) \le \sum_i \mu^*(E_i),$$

for any sequence $\{E_i\} \subset \mathcal{P}(X)$.


By (1) and (3), outer measures are finitely subadditive (i.e., (3) is valid for
finite sequences $\{E_i\}$ as well).


Theorem 2.6. Let $\mu$ be a measure on the algebra $\mathcal{A}$ of subsets of $X$. For any
$E \in \mathcal{P}(X)$, let

$$\mu^*(E) := \inf \sum_i \mu(E_i),$$

where the infimum is taken over all sequences $\{E_i\} \subset \mathcal{A}$ with $E \subset \bigcup E_i$ (briefly,
call such sequences "$\mathcal{A}$-covers of $E$"). Then $\mu^*$ is an outer measure on $X$, called
the outer measure generated by $\mu$, and $\mu^*|_{\mathcal{A}} = \mu$.


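Proof. We begin by showing that $\mu^*|_{\mathcal{A}} = \mu$. If $E \in \mathcal{A}$, then $\{E, \emptyset, \emptyset, \ldots\}$ is an
$\mathcal{A}$-cover of $E$, hence $\mu^*(E) \le \mu(E)$. Next, if $\{E_i\}$ is any $\mathcal{A}$-cover of $E$, then for
all $n \in \mathbb{N}$,

$$F_n := E \cap E_n \cap E_{n-1}^c \cap \cdots \cap E_1^c \in \mathcal{A}$$

(since $\mathcal{A}$ is an algebra), and $E \in \mathcal{A}$ is the disjoint union of the sets $F_n \subset E_n$.
Therefore, since $\mu$ is a measure on the algebra $\mathcal{A}$,

$$\sum_n \mu(E_n) \ge \sum_n \mu(F_n) = \mu(E).$$

Taking the infimum over all $\mathcal{A}$-covers $\{E_n\}$ of $E$, we obtain $\mu^*(E) \ge \mu(E)$, and
the wanted equality follows.

In particular, $\mu^*(\emptyset) = \mu(\emptyset) = 0$.

If $E \subset F \subset X$, then every $\mathcal{A}$-cover of $F$ is also an $\mathcal{A}$-cover of $E$; this implies
that $\mu^*(E) \le \mu^*(F)$.

Let $E_n \subset X$, $n \in \mathbb{N}$, and $E = \bigcup_n E_n$. For $\epsilon > 0$ given, and for each $n \in \mathbb{N}$,
there exists an $\mathcal{A}$-cover $\{E_{n,i}\}_i$ of $E_n$ such that

$$\sum_i \mu(E_{n,i}) \le \mu^*(E_n) + \epsilon/2^n.$$

Since $\{E_{n,i};\ n, i \in \mathbb{N}\}$ is an $\mathcal{A}$-cover of $E$, we have

$$\mu^*(E) \le \sum_{n,i} \mu(E_{n,i}) \le \sum_n \mu^*(E_n) + \epsilon,$$

and the arbitrariness of $\epsilon$ implies that $\mu^*$ is countably subadditive.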


Definition 2.7 (The Carathéodory measurability condition). Let $\mu^*$ be
an outer measure on $X$. A set $E \subset X$ is $\mu^*$-measurable if

$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \cap E^c) \tag{1}$$

for every $A \subset X$.


We shall denote by $\mathcal{M}$ the family of all $\mu^*$-measurable subsets of $X$.
By subadditivity of outer measures, (1) is equivalent to the inequality

$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c) \tag{2}$$

(for every $A \subset X$). Since (2) is trivial when $\mu^*(A) = \infty$, we can use only subsets
$A$ of finite outer measure in the measurability test (2).


Theorem 2.8. Let $\mu^*$ be an outer measure on $X$, let $\mathcal{M}$ be the family of all
$\mu^*$-measurable subsets of $X$, and let $\bar\mu := \mu^*|_{\mathcal{M}}$. Then $(X, \mathcal{M}, \bar\mu)$ is a complete
positive measure space (called the measure space induced by the given outer
measure).


Proof. If $\mu^*(E) = 0$, also $\mu^*(A \cap E) = 0$ by monotonicity (for all $A \subset X$), and
(2) follows (again by monotonicity of $\mu^*$). Hence $E \in \mathcal{M}$ whenever $\mu^*(E) = 0$,
and in particular $\emptyset \in \mathcal{M}$. By monotonicity, this implies also that the measure
space of the theorem is automatically complete.

The symmetry of the Carathéodory condition in $E$ and $E^c$ implies that $E^c \in \mathcal{M}$ whenever $E \in \mathcal{M}$.

Let $E, F \in \mathcal{M}$. Then for all $A \subset X$, it follows from (2) (first for $F$ with the
"test set" $A$, and then for $E$ with the test set $A \cap F^c$) and the finite subadditivity
of $\mu^*$, that

$$\mu^*(A) \ge \mu^*(A \cap F) + \mu^*(A \cap F^c)$$
$$\ge \mu^*(A \cap F) + \mu^*(A \cap F^c \cap E) + \mu^*(A \cap F^c \cap E^c)$$
$$\ge \mu^*(A \cap (E \cup F)) + \mu^*(A \cap (E \cup F)^c),$$

and we conclude that $E \cup F \in \mathcal{M}$, and so $\mathcal{M}$ is an algebra on $X$. It follows
in particular that any countable union $E$ of sets from $\mathcal{M}$ can be written as a
disjoint countable union of sets $E_i \in \mathcal{M}$. Set $F_n = \bigcup_{i=1}^n E_i$ $(\subset E)$, $n = 1, 2, \ldots$.
Then $F_n \in \mathcal{M}$, and therefore, by (2) and monotonicity, we have for all $A \subset X$

$$\mu^*(A) \ge \mu^*(A \cap F_n) + \mu^*(A \cap F_n^c) \ge \mu^*(A \cap F_n) + \mu^*(A \cap E^c). \tag{3}$$

By (1), since $E_n \in \mathcal{M}$, we have

$$\mu^*(A \cap F_n) = \mu^*(A \cap F_n \cap E_n) + \mu^*(A \cap F_n \cap E_n^c) = \mu^*(A \cap E_n) + \mu^*(A \cap F_{n-1}). \tag{4}$$

The recursion (4) implies that for all $n \in \mathbb{N}$,

$$\mu^*(A \cap F_n) = \sum_{i=1}^n \mu^*(A \cap E_i). \tag{5}$$

Substitute (5) in (3), let $n \to \infty$, and use the $\sigma$-subadditivity of $\mu^*$; whence

$$\mu^*(A) \ge \sum_{i=1}^\infty \mu^*(A \cap E_i) + \mu^*(A \cap E^c) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c).$$

This shows that $E \in \mathcal{M}$, and we conclude that $\mathcal{M}$ is a $\sigma$-algebra.
Choosing $A = F_n$ in (5), we obtain (by monotonicity)

$$\mu^*(E) \ge \mu^*(F_n) = \sum_{i=1}^n \mu^*(E_i), \qquad n = 1, 2, \ldots.$$

Letting $n \to \infty$, we see that

$$\mu^*(E) \ge \sum_{i=1}^\infty \mu^*(E_i).$$

Together with the $\sigma$-subadditivity of $\mu^*$, this proves the $\sigma$-additivity of $\mu^*$
restricted to $\mathcal{M}$, as wanted.
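
On a small finite set the Carathéodory condition can be tested by brute force over all test sets $A$. The sketch below uses a toy outer measure on a three-point set, chosen for illustration and not taken from the text; the names `powerset`, `measurable_sets`, and `outer` are hypothetical.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets, ordered by size."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def measurable_sets(X, outer):
    """All E with outer(A) == outer(A & E) + outer(A - E) for every A ⊂ X."""
    subsets = powerset(X)
    return [E for E in subsets
            if all(outer(A) == outer(A & E) + outer(A - E) for A in subsets)]

X = frozenset({0, 1, 2})

def outer(E):
    """A toy outer measure: 0 on the empty set, 2 on X, and 1 on every other subset."""
    if not E:
        return 0
    return 2 if E == X else 1

M = measurable_sets(X, outer)
```

For this outer measure only $\emptyset$ and $X$ pass the test, so the induced $\sigma$-algebra $\mathcal{M}$ is trivial; with the counting measure (`outer = len`) every subset is measurable.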


2.3 Extension of measures on algebras 


Combining Theorems 2.6 and 2.8, we obtain the Caratheodory extension theorem. 


Theorem 2.9. Let $\mu$ be a measure on the algebra $\mathcal{A}$, let $\mu^*$ be the outer measure
induced by $\mu$, and let $(X, \mathcal{M}, \bar\mu)$ be the (complete, positive) measure space induced
by $\mu^*$. Then $\mathcal{A} \subset \mathcal{M}$, and $\bar\mu$ extends $\mu$ to a measure (on the $\sigma$-algebra $\mathcal{M}$), which
is finite ($\sigma$-finite) if $\mu$ is finite ($\sigma$-finite, respectively).


The measure $\bar\mu$ is called the Carathéodory extension of $\mu$.


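Proof. By Theorems 2.6 and 2.8, we need only prove the inclusion $\mathcal{A} \subset \mathcal{M}$,
for then

$$\bar\mu|_{\mathcal{A}} = (\mu^*|_{\mathcal{M}})|_{\mathcal{A}} = \mu^*|_{\mathcal{A}} = \mu.$$

Let then $E \in \mathcal{A}$, and let $A \subset X$ be such that $\mu^*(A) < \infty$. For any given $\epsilon > 0$,
there exists an $\mathcal{A}$-cover $\{E_i\}$ of $A$ such that

$$\mu^*(A) + \epsilon > \sum_i \mu(E_i). \tag{1}$$

Since $E \in \mathcal{A}$, $\{E_i \cap E\}$ and $\{E_i \cap E^c\}$ are $\mathcal{A}$-covers of $A \cap E$ and $A \cap E^c$,
respectively, and therefore

$$\sum_i \mu(E_i \cap E) \ge \mu^*(A \cap E)$$

and

$$\sum_i \mu(E_i \cap E^c) \ge \mu^*(A \cap E^c).$$

Adding these relations and using the additivity of the measure $\mu$ on the algebra
$\mathcal{A}$, we obtain

$$\sum_i \mu(E_i) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c). \tag{2}$$

By (1), (2), and the arbitrariness of $\epsilon$, we get

$$\mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c)$$

for all $A \subset X$, so that $E \in \mathcal{M}$, and $\mathcal{A} \subset \mathcal{M}$.


If $\mu$ is finite, then since $X \in \mathcal{A}$ and $\mu^*|_{\mathcal{A}} = \mu$, we have $\mu^*(X) = \mu(X) < \infty$.
The $\sigma$-finite case is analogous.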


If we start from a semi-measure $\mu$ on a semi-algebra $\mathcal{C}$, we first extend it to a
measure (same notation) on the algebra $\mathcal{A}$ generated by $\mathcal{C}$ (as in Theorem 2.4).
We then apply the Carathéodory extension theorem to obtain the complete
positive measure space $(X, \mathcal{M}, \bar\mu)$ with $\mathcal{A} \subset \mathcal{M}$ and $\bar\mu$ extending $\mu$. Note that if
$\mu$ is finite ($\sigma$-finite) on $\mathcal{C}$, then its extension $\bar\mu$ is finite ($\sigma$-finite, respectively).


2.4 Structure of measurable sets 


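Let $\mu$ be a measure on the algebra $\mathcal{A}$ on $X$. Denote by $\mu^*$ the outer measure
induced by $\mu$, and let $\mathcal{M}$ be the $\sigma$-algebra of all $\mu^*$-measurable subsets of $X$.
Consider the family $\mathcal{A}_\sigma \subset \mathcal{M}$ of all countable unions of sets from $\mathcal{A}$. Note that
if we start from a semi-algebra $\mathcal{C}$ and $\mathcal{A}$ is the algebra generated by it, then
$\mathcal{A}_\sigma = \mathcal{C}_\sigma$.


Lemma 2.10. For any $E \subset X$ with $\mu^*(E) < \infty$ and for any $\epsilon > 0$, there exists
$A \in \mathcal{A}_\sigma$ such that $E \subset A$ and

$$\mu^*(A) \le \mu^*(E) + \epsilon.$$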


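Proof. By definition of $\mu^*(E)$, there exists an $\mathcal{A}$-cover $\{E_i\}$ of $E$ such that
$\sum_i \mu(E_i) \le \mu^*(E) + \epsilon$. Then $A := \bigcup E_i \in \mathcal{A}_\sigma$, $E \subset A$, and

$$\mu^*(A) \le \sum_i \mu^*(E_i) = \sum_i \mu(E_i) \le \mu^*(E) + \epsilon,$$

as wanted.


If $\mathcal{B}$ is any family of subsets of $X$, denote by $\mathcal{B}_\delta$ the family of all countable
intersections of sets from $\mathcal{B}$. Let $\mathcal{A}_{\sigma\delta} := (\mathcal{A}_\sigma)_\delta$.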


Proposition 2.11. Let $\mu, \mathcal{A}, \mu^*$ be as before. Then for each $E \subset X$ with
$\mu^*(E) < \infty$, there exists $A \in \mathcal{A}_{\sigma\delta}$ such that $E \subset A$ and $\mu^*(E) = \mu^*(A)$ $(= \bar\mu(A))$.


Proof. Let $E$ be a subset of $X$ with finite outer measure. For each $n \in \mathbb{N}$, there
exists $A_n \in \mathcal{A}_\sigma$ such that $E \subset A_n$ and $\mu^*(A_n) \le \mu^*(E) + 1/n$ (by Lemma 2.10).
Therefore, $A := \bigcap A_n \in \mathcal{A}_{\sigma\delta}$, $E \subset A$, and

$$\mu^*(E) \le \mu^*(A) \le \mu^*(A_n) \le \mu^*(E) + 1/n$$

for all $n$, so that $\mu^*(E) = \mu^*(A)$.

The structure of $\mu^*$-measurable sets is described in the next theorem.


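Theorem 2.12. Let $\mu$ be a $\sigma$-finite measure on the algebra $\mathcal{A}$ on $X$, and let $\mu^*$
be the outer measure induced by it. Then $E \subset X$ is $\mu^*$-measurable iff there exists
$A \in \mathcal{A}_{\sigma\delta}$ such that $E \subset A$ and $\mu^*(A - E) = 0$.


Proof. We observed in the proof of Theorem 2.8 that $\mathcal{M}$ contains every set of
$\mu^*$-measure zero. Thus, if $E \subset A \in \mathcal{A}_{\sigma\delta}$ and $\mu^*(A - E) = 0$, then $A - E \in \mathcal{M}$
and $A \in \mathcal{M}$ (because $\mathcal{A}_{\sigma\delta} \subset \mathcal{M}$), and therefore $E = A - (A - E) \in \mathcal{M}$.

Conversely, suppose $E \in \mathcal{M}$. By the $\sigma$-finiteness hypothesis, we may write
$X = \bigcup X_i$ with $X_i \in \mathcal{A}$ mutually disjoint and $\mu(X_i) < \infty$. Let $E_i := E \cap X_i$.
By Lemma 2.10, there exist $A_{ni} \in \mathcal{A}_\sigma$ such that $E_i \subset A_{ni}$ and $\mu^*(A_{ni}) \le \mu^*(E_i) + 1/(n 2^i)$, for all $n, i \in \mathbb{N}$. Set $A_n := \bigcup_i A_{ni}$. Then for all $n$, $A_n \in \mathcal{A}_\sigma$,
$E \subset A_n$, and $A_n - E \subset \bigcup_i (A_{ni} - E_i)$, so that

$$\mu^*(A_n - E) \le \sum_i \mu^*(A_{ni} - E_i) \le \sum_i \frac{1}{n 2^i} = \frac{1}{n}.$$

Let $A := \bigcap A_n$. Then $E \subset A$, $A \in \mathcal{A}_{\sigma\delta}$, and since $A - E \subset A_n - E$ for all $n$,
$\mu^*(A - E) = 0$.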

We can use Lemma 2.10 to prove a uniqueness theorem for the extension of 
measures on algebras. 


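Theorem 2.13 (Uniqueness of extension). Let $\mu$ be a measure on the algebra
$\mathcal{A}$ on $X$, and let $\bar\mu$ be the Carathéodory extension of $\mu$ (as a measure on the
$\sigma$-algebra $\mathcal{M}$, cf. Theorem 2.9). Consider the $\sigma$-algebra $\mathcal{B}$ generated by $\mathcal{A}$ (of
course, $\mathcal{B} \subset \mathcal{M}$). If $\mu_1$ is any measure that extends $\mu$ to $\mathcal{B}$, then $\mu_1(E) = \bar\mu(E)$
for any set $E \in \mathcal{B}$ with $\bar\mu(E) < \infty$. If $\mu$ is $\sigma$-finite, then $\mu_1 = \bar\mu$ on $\mathcal{B}$.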




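Proof. Since $\mu_1 = \mu = \tilde{\mu}$ on $\mathcal{A}$, and each set in $\mathcal{A}_\sigma$ is a disjoint countable union
of sets $A_i \in \mathcal{A}$, we have $\mu_1 = \tilde{\mu}$ on $\mathcal{A}_\sigma$.

Let $E \in \mathcal{B}$ with $\tilde{\mu}(E) < \infty$, and let $\epsilon > 0$. By Lemma 2.10, there exists
$A \in \mathcal{A}_\sigma$ such that $E \subset A$ and $\tilde{\mu}(A) \le \tilde{\mu}(E) + \epsilon$. Hence

$$\mu_1(E) \le \mu_1(A) = \tilde{\mu}(A) \le \tilde{\mu}(E) + \epsilon,$$

and therefore, $\mu_1(E) \le \tilde{\mu}(E)$, by the arbitrariness of $\epsilon$.
Note in passing that $A - E \in \mathcal{B}$ with $\tilde{\mu}(A - E) \le \epsilon < \infty$ (for any $A$ as
discussed). Therefore, we have in particular $\mu_1(A - E) \le \tilde{\mu}(A - E) \le \epsilon$. Hence

$$\tilde{\mu}(E) \le \tilde{\mu}(A) = \mu_1(A) = \mu_1(E) + \mu_1(A - E) \le \mu_1(E) + \epsilon,$$

and the reverse inequality $\tilde{\mu}(E) \le \mu_1(E)$ follows.

If $\mu$ is $\sigma$-finite, write $X$ as the disjoint union of $X_i \in \mathcal{A}$ with $\mu(X_i) < \infty$, $i = 1, 2, \ldots$.
Then, each $E \in \mathcal{B}$ is the disjoint union of $E_i := E \cap X_i$ with $\tilde{\mu}(E_i) < \infty$;
since $\mu_1(E_i) = \tilde{\mu}(E_i)$ for all $i$, also $\mu_1(E) = \tilde{\mu}(E)$, by $\sigma$-additivity of both
measures.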


2.5 Construction of Lebesgue-Stieltjes measures


Here we apply the general method of construction of measures described
in the preceding sections to the special semi-algebra $\mathcal{C}$ in the example
following Definition 2.1, and to the semi-measure $\mu$ induced by a given
non-decreasing right-continuous function $F : \mathbb{R} \to \mathbb{R}$. Denote
$F(\infty) := \lim_{x \to \infty} F(x)$ $(\in (-\infty, \infty])$, and similarly $F(-\infty)$ $(\in [-\infty, \infty))$ (both limits exist,
because $F$ is non-decreasing). We define the semi-measure $\mu$ (induced by $F$) by

$$\mu(\emptyset) = 0; \qquad \mu((a,b]) = F(b) - F(a) \quad (a, b \in \mathbb{R},\ a < b);$$
$$\mu((-\infty, b]) = F(b) - F(-\infty); \qquad \mu((a, \infty)) = F(\infty) - F(a), \quad a, b \in \mathbb{R}.$$

The example following Definition 2.2 is the special case with $F(x) = x$, $x \in \mathbb{R}$.
We verify that the properties (2) and (3) of Definition 2.2 are satisfied.


Suppose $(a, b]$ is the disjoint finite union of similar intervals. Then we may
index the subintervals so that

$$a = a_1 < b_1 = a_2 < b_2 = a_3 < \cdots < b_n = b.$$

Therefore,

$$\sum_{i=1}^{n} \mu((a_i, b_i]) = \sum_{i=1}^{n} [F(b_i) - F(a_i)] = \sum_{i=1}^{n-1} [F(a_{i+1}) - F(a_i)] + F(b) - F(a_n) = F(b) - F(a) = \mu((a, b])$$

for $a, b \in \mathbb{R}$, $a < b$. A similar argument for the cases $(-\infty, b]$ and $(a, \infty)$ completes
the verification of Property (2). In order to verify Property (3), we show that
whenever $(a, b] \subset \bigcup_{i=1}^{\infty} (a_i, b_i]$, then

$$F(b) - F(a) \le \sum_{i=1}^{\infty} [F(b_i) - F(a_i)]. \tag{1}$$

This surely implies Property (3) for $a, b$ finite. If $(-\infty, b]$ is contained in such a
union, then $(-n, b]$ is contained in it as well, for all $n \in \mathbb{N}$, so that $F(b) - F(-n)$
is majorized by the sum on the right-hand side of (1) for all $n$; letting $n \to \infty$,
we deduce that this sum majorizes $\mu((-\infty, b])$. A similar argument works for
$\mu((a, \infty))$.

Let $\epsilon > 0$. By the right continuity of $F$, there exist $c_i$, $i = 0, 1, 2, \ldots$ such that

$$a < c_0; \qquad F(c_0) < F(a) + \epsilon;$$
$$b_i < c_i; \qquad F(c_i) < F(b_i) + \epsilon/2^i; \qquad i = 1, 2, \ldots. \tag{2}$$

We have $[c_0, b] \subset \bigcup_{i=1}^{\infty} (a_i, c_i)$, so that, by compactness, a finite number $n$ of
intervals $(a_i, c_i)$ covers $[c_0, b]$. Thus, $c_0$ is in one of these intervals, say $(a_1, c_1)$
(to simplify notation), that is,

$$a_1 < c_0 < c_1.$$

Assuming we got $(a_i, c_i)$, $1 \le i < k$ such that

$$a_i < c_{i-1} < c_i, \tag{3}$$

and $c_{k-1} \le b$ (that is, $c_{k-1} \in [c_0, b]$), there exists one of the $n$ intervals shown,
say $(a_k, c_k)$ to simplify notation, that contains $c_{k-1}$, so that (3) is valid for $i = k$
as well. This (finite) inductive process will end after at most $n$ steps (this will
exhaust our finite cover), which means that for some $k \le n$, we must get

$$b < c_k. \tag{4}$$

By (4), (3), and (2) (in this order),

$$\mu((a, b]) := F(b) - F(a) \le F(c_k) + \sum_{i=1}^{k} [F(c_{i-1}) - F(a_i)] - F(c_0) + \epsilon$$
$$= \sum_{i=1}^{k} [F(c_i) - F(a_i)] + \epsilon \le \sum_{i=1}^{\infty} [F(c_i) - F(a_i)] + \epsilon$$
$$\le \sum_{i=1}^{\infty} [F(b_i) - F(a_i)] + 2\epsilon = \sum_{i=1}^{\infty} \mu((a_i, b_i]) + 2\epsilon,$$
as wanted (by the arbitrariness of $\epsilon$). By Theorems 2.4 and 2.9, the semi-measure
$\mu$ has an extension as a complete measure (also denoted $\mu$) on a $\sigma$-algebra $\mathcal{M}$
containing the Borel algebra $\mathcal{B}$ (= the $\sigma$-algebra generated by $\mathcal{C}$, or by the
algebra $\mathcal{A}$). By Theorem 2.13, the extension is uniquely determined on $\mathcal{B}$. The
(complete) measure space $(\mathbb{R}, \mathcal{M}, \mu)$ is called the Lebesgue-Stieltjes measure
space induced by the given function $F$ (in the special case $F(x) = x$, this is
the Lebesgue measure space). By Theorem 2.13, the "Lebesgue-Stieltjes measure
$\mu$ induced by $F$" is the unique measure on $\mathcal{B}$ such that $\mu((a, b]) = F(b) - F(a)$.
It is customary to write the integral $\int f \, d\mu$ in the form $\int f \, dF$, and to call $F$ the
"distribution of $\mu$". Accordingly, in the special case of Lebesgue measure, the
described integral is customarily written in the form $\int f \, dx$.
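As an informal illustration of the semi-measure $\mu((a, b]) = F(b) - F(a)$, the following Python sketch evaluates it on half-open intervals and checks finite additivity over adjacent pieces. The function names and the sample distribution `F_jump` (a unit jump at $0$ added to $F(x) = x$) are hypothetical choices made for this example only; they are not taken from the text.

```python
def stieltjes_measure(F, a, b):
    """Semi-measure of the half-open interval (a, b]: F(b) - F(a), for a < b."""
    if not a < b:
        raise ValueError("need a < b")
    return F(b) - F(a)


def F_lebesgue(x):
    # F(x) = x gives the length of (a, b].
    return x


def F_jump(x):
    # Hypothetical example: non-decreasing and right-continuous,
    # with a jump of size 1 at x = 0.
    return x + (1.0 if x >= 0 else 0.0)


# (-1, 1] written as a disjoint union of adjacent half-open intervals.
cuts = [-1.0, -0.5, 0.0, 0.25, 1.0]
pieces = sum(
    stieltjes_measure(F_jump, cuts[i], cuts[i + 1]) for i in range(len(cuts) - 1)
)
whole = stieltjes_measure(F_jump, cuts[0], cuts[-1])
```

The piece $(-0.5, 0]$ carries the jump at $0$, which is why right continuity of $F$ matches the choice of intervals closed on the right.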

The Lebesgue measure $\mu$ is the unique measure on $\mathcal{B}$ such that $\mu((a, b])
= b - a$ for all real $a < b$; in particular, it is translation invariant on $\mathcal{C}$, hence
on $\mathcal{A}$ (if $E \in \mathcal{A}$, write $E$ as a finite disjoint union of intervals $(a_i, b_i]$; then
$\mu(t + E) = \mu(\bigcup_i \{t + (a_i, b_i]\}) = \sum_i \mu((t + a_i, t + b_i]) = \sum_i \mu((a_i, b_i]) = \mu(E)$ for
all real $t$). Let $\mu^*$ be the outer measure induced by $\mu$. Then for all $E \subset \mathbb{R}$ and
$t \in \mathbb{R}$, $\{E_i\}$ is an $\mathcal{A}$-cover of $E$ if and only if $\{t + E_i\}$ is an $\mathcal{A}$-cover of $t + E$, and
therefore

$$\mu^*(t + E) := \inf \sum_i \mu(t + E_i) = \inf \sum_i \mu(E_i) = \mu^*(E).$$

In particular, if $\mu^*(E) = 0$, then $\mu^*(t + E) = 0$ for all real $t$. By Theorem 2.12,
$E \in \mathcal{M}$ (that is, Lebesgue measurable on $\mathbb{R}$) iff there exists $A \in \mathcal{A}_{\sigma\delta}$ such that
$E \subset A$ and $\mu^*(A - E) = 0$. However, translations of unions and intersections
of sets are unions and intersections of the translated sets (respectively). Thus,
the existence of $A \in \mathcal{A}_{\sigma\delta}$ as shown implies that $t + A \in \mathcal{A}_{\sigma\delta}$, $t + E \subset t + A$,
and $\mu^*((t + A) - (t + E)) = \mu^*(t + (A - E)) = 0$, that is, $t + E \in \mathcal{M}$ (by
Theorem 2.12), for all $t \in \mathbb{R}$. Since Lebesgue measure is the restriction of $\mu^*$ to
$\mathcal{M}$, we conclude from this discussion that the Lebesgue measure space $(\mathbb{R}, \mathcal{M}, \mu)$
is translation invariant, which means that $t + E \in \mathcal{M}$ and $\mu(t + E) = \mu(E)$ for
all $t \in \mathbb{R}$ and $E \in \mathcal{M}$. In the terminology of Section 1.26, the map $h(x) = x - t$ is
a measurable map of $(\mathbb{R}, \mathcal{M})$ onto itself (for each given $t$), and the corresponding
measure is $\nu(E) := \mu(h^{-1}(E)) = \mu(t + E) = \mu(E)$. Therefore, by the Proposition
there, for any non-negative measurable function and for any $\mu$-integrable complex
function $f$ on $\mathbb{R}$,

$$\int f_t \, d\mu = \int f \, d\mu,$$

where $f_t(x) := f(x - t)$. This is the translation invariance of the Lebesgue
integral.

Consider the quotient group $\mathbb{R}/\mathbb{Q}$ of the additive group $\mathbb{R}$ by the subgroup
$\mathbb{Q}$ of rationals. Let $A$ be an arbitrary bounded Lebesgue measurable subset of
$\mathbb{R}$ of positive measure. By the Axiom of Choice, there exists a set $E \subset A$ that
contains precisely one point from each coset in $\mathbb{R}/\mathbb{Q}$ that meets $A$. Since $A$ is
bounded, $A \subset (-a, a)$ for some $a \in (0, \infty)$. We claim that $A$ is contained in the
disjoint union $S := \bigcup_{r \in \mathbb{Q} \cap (-2a, 2a)} (r + E)$. Indeed, if $x \in A$, there exists a unique
$y \in E$ such that $x, y$ are in the same coset of $\mathbb{Q}$, that is, $x - y = r \in \mathbb{Q}$, hence
$x = r + y \in r + E$, $|r| \le |x| + |y| < 2a$, so that indeed $x \in S$. If $r, s \in \mathbb{Q} \cap (-2a, 2a)$
are distinct and $x \in (r + E) \cap (s + E)$, then there exist $y, z \in E$ such that
$x = r + y = s + z$. Hence, $y - z = s - r \in \mathbb{Q} - \{0\}$, which means that $y, z$ are
distinct points of $E$ belonging to the same coset, contrary to the definition of
$E$. Thus, the union $S$ is indeed a disjoint union. Write $\mathbb{Q} \cap (-2a, 2a) = \{r_k\}$.
Suppose $E$ is (Lebesgue) measurable. Since $r_k + E \subset r_k + A \subset (-3a, 3a)$ for all $k$,
it follows that $S$ is a measurable subset of $(-3a, 3a)$. Therefore, by $\sigma$-additivity
and translation invariance of $\mu$,

$$6a \ge \mu(S) = \sum_{k=1}^{\infty} \mu(r_k + E) \ge \sum_{k=1}^{n} \mu(E) = n\mu(E)$$

for all $n \in \mathbb{N}$, hence $\mu(E) = 0$ and $\mu(S) = 0$. Since $A \subset S$, also $\mu(A) = 0$,
contradicting our hypothesis. This shows that $E$ is not Lebesgue measurable.
Since any measurable set on $\mathbb{R}$ of positive measure contains a bounded measurable
subset of positive measure, we proved the following


Proposition. Every (Lebesgue-) measurable subset of R of positive measure 
contains a non-measurable subset. 


2.6 Riemann vs. Lebesgue 


Let $-\infty < a < b < \infty$, and let $f : [a, b] \to \mathbb{R}$ be bounded. Denote

$$m = \inf_{[a,b]} f; \qquad M = \sup_{[a,b]} f.$$

Given a "partition" $P = \{x_k;\ k = 0, \ldots, n\}$ of $[a, b]$, where $a = x_0 < x_1 < \cdots <
x_n = b$, we denote

$$m_k = \inf_{[x_{k-1}, x_k]} f; \qquad M_k = \sup_{[x_{k-1}, x_k]} f;$$

$$L_P = \sum_{k=1}^{n} m_k (x_k - x_{k-1}); \qquad U_P = \sum_{k=1}^{n} M_k (x_k - x_{k-1}).$$

Recall that the lower and upper Riemann integrals of $f$ over $[a, b]$ are defined as
the supremum and infimum of $L_P$ and $U_P$ (respectively) over all partitions $P$,
and $f$ is Riemann integrable over $[a, b]$ if these lower and upper integrals coincide
(their common value is the Riemann integral, denoted $\int_a^b f(x) \, dx$). For bounded
complex functions $f = u + iv$ with $u, v$ real, one says that $f$ is Riemann integrable
iff both $u$ and $v$ are Riemann integrable, and $\int_a^b f \, dx := \int_a^b u \, dx + i \int_a^b v \, dx$.
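The sums $L_P$ and $U_P$ can be computed directly in simple cases. The sketch below is a minimal illustration under the extra assumption that $f$ is non-decreasing, so that the infimum and supremum over $[x_{k-1}, x_k]$ are attained at the left and right endpoints; the helper names are hypothetical.

```python
def darboux_sums(f, partition):
    """Return (L_P, U_P) for a NON-DECREASING f (assumption of this sketch):
    m_k = f(x_{k-1}) and M_k = f(x_k) on each subinterval."""
    lower = 0.0
    upper = 0.0
    for x_prev, x_next in zip(partition, partition[1:]):
        width = x_next - x_prev
        lower += f(x_prev) * width
        upper += f(x_next) * width
    return lower, upper


def uniform_partition(a, b, n):
    """Partition a = x_0 < x_1 < ... < x_n = b into n equal parts."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def square(x):
    return x * x


L4, U4 = darboux_sums(square, uniform_partition(0.0, 1.0, 4))
L400, U400 = darboux_sums(square, uniform_partition(0.0, 1.0, 400))
```

For a non-decreasing $f$ and a uniform partition into $n$ parts, $U_P - L_P = (f(b) - f(a))(b - a)/n$, so the gap closes as the partition is refined and both sums bracket the integral $1/3$.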
Proposition. If a bounded (complex) function on the real interval $[a, b]$ is
Riemann integrable, then it is Lebesgue integrable on $[a, b]$, and its Lebesgue
integral $\int_{[a,b]} f \, dx$ coincides with its Riemann integral $\int_a^b f \, dx$.


Proof. It suffices to consider bounded real functions $f$. Given a partition $P$,
consider the simple Borel functions

$$l_P = f(a) I_{\{a\}} + \sum_k m_k I_{(x_{k-1}, x_k]}; \qquad u_P = f(a) I_{\{a\}} + \sum_k M_k I_{(x_{k-1}, x_k]}.$$

Then $l_P \le f \le u_P$ on $[a, b]$, and

$$\int_{[a,b]} l_P \, dx = L_P; \qquad \int_{[a,b]} u_P \, dx = U_P. \tag{1}$$

If $f$ is Riemann integrable, there exists a sequence of partitions $P_j$ of $[a, b]$ such
that $P_{j+1}$ is a refinement of $P_j$, $\|P_j\| := \max_{x_k \in P_j} (x_k - x_{k-1}) \to 0$ as $j \to \infty$, and

$$\lim_j L_{P_j} = \lim_j U_{P_j} = \int_a^b f \, dx. \tag{2}$$

The sequences $l_j := l_{P_j}$ and $u_j := u_{P_j}$ are monotonic (non-decreasing and
non-increasing, respectively) and bounded. Let then $l := \lim_j l_j$ and $u := \lim_j u_j$.
These are bounded Borel functions, and by (1), (2), and the Lebesgue dominated
convergence theorem,

$$\int_{[a,b]} l \, dx = \lim_j \int_{[a,b]} l_j \, dx = \lim_j L_{P_j} = \int_a^b f \, dx, \tag{3}$$

and similarly $\int_{[a,b]} u \, dx = \int_a^b f \, dx$. In particular, $\int_{[a,b]} (u - l) \, dx = 0$, and since
$u - l \ge 0$, it follows that $u = l$ a.e.; however $l \le f \le u$, hence $f = u = l$
a.e.; therefore, $f$ is Lebesgue measurable (hence Lebesgue integrable, since it is
bounded) and $\int_{[a,b]} f \, dx = \int_{[a,b]} l \, dx = \int_a^b f \, dx$ by (3).


A similar proposition is valid for absolutely convergent improper Riemann 
integrals (on finite or infinite intervals). The easy proofs are omitted. 

Let $Q = \bigcup_j P_j$ (a countable set, hence a Lebesgue null set). If $x \in [a, b]$ is
not in $Q$, $f$ is continuous at $x$ iff $l(x) = u(x)$. It follows from the preceding proof
that if $f$ is Riemann integrable, then it is continuous at almost all points not in
$Q$, that is, almost everywhere in $[a, b]$. Conversely, if $f$ is continuous a.e., then
$l = f = u$ a.e., hence $\int_{[a,b]} l \, dx = \int_{[a,b]} u \, dx$. Therefore, given $\epsilon > 0$, there exists
$j$ such that $\int_{[a,b]} u_j \, dx - \int_{[a,b]} l_j \, dx < \epsilon$, that is, $U_{P_j} - L_{P_j} < \epsilon$. This means that
$f$ is Riemann integrable on $[a, b]$. Formally:


Proposition. Let f be a bounded complex function on [a,b]. Then f is Riemann 
integrable on [a,b] iff it is continuous almost everywhere in [a, b]. 


2.7 Product measure 


Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be measure spaces. A measurable rectangle is a
cartesian product $A \times B$ with $A \in \mathcal{A}$ and $B \in \mathcal{B}$. The set $\mathcal{C}$ of all measurable
rectangles is a semi-algebra, since

$$(A \times B) \cap (C \times D) = (A \cap C) \times (B \cap D)$$

and

$$(A \times B)^c = (A^c \times B) \cup (A \times B^c) \cup (A^c \times B^c),$$

where the union on the right is clearly disjoint. Define $\lambda$ on $\mathcal{C}$ by

$$\lambda(A \times B) = \mu(A)\nu(B).$$

We claim that $\lambda$ is a semi-measure on $\mathcal{C}$ (cf. Definition 2.2). Indeed, Property (1)
is trivial, while Properties (2) and (3) follow from the stronger property:
If $A_i \times B_i \in \mathcal{C}$ are mutually disjoint with union $A \times B \in \mathcal{C}$, then

$$\lambda(A \times B) = \sum_i \mu(A_i)\nu(B_i). \tag{1}$$


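Proof. Let $x \in A$. For each $y \in B$, there exists a unique $i$ such that the pair
$[x, y]$ belongs to $A_i \times B_i$ (because the rectangles are mutually disjoint). Thus, $B$
decomposes as the disjoint union

$$B = \bigcup_{\{i;\ x \in A_i\}} B_i.$$

Therefore,

$$\nu(B) = \sum_{\{i;\ x \in A_i\}} \nu(B_i),$$

and so

$$\nu(B) I_A(x) = \sum_i \nu(B_i) I_{A_i}(x).$$

By Beppo Levi's theorem (1.16),

$$\lambda(A \times B) := \mu(A)\nu(B) = \int_X \nu(B) I_A(x) \, d\mu = \sum_i \int_X \nu(B_i) I_{A_i}(x) \, d\mu = \sum_i \mu(A_i)\nu(B_i).$$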


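By the Caratheodory extension theorem, there exists a complete measure
space, which we denote

$$(X \times Y, \mathcal{A} \times \mathcal{B}, \mu \times \nu),$$

and call the product of the given measure spaces, such that $\mathcal{C} \subset \mathcal{A} \times \mathcal{B}$ and

$$(\mu \times \nu)(A \times B) = \lambda(A \times B) := \mu(A)\nu(B)$$

for $A \times B \in \mathcal{C}$.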
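Property (1) can be checked by hand on finite sets. The following Python sketch is only an illustration: it assumes hypothetical finite spaces with point weights (stored as exact fractions), splits a rectangle $A \times B$ into three disjoint rectangles, and compares $\lambda(A \times B)$ with the sum of the $\lambda(A_i \times B_i)$.

```python
from fractions import Fraction

# Hypothetical point weights defining mu on X and nu on Y.
mu = {"x1": Fraction(1, 2), "x2": Fraction(3, 2), "x3": Fraction(2)}
nu = {"y1": Fraction(1, 3), "y2": Fraction(5, 3)}


def measure(weights, subset):
    """Measure of a finite set as the sum of its point weights."""
    return sum((weights[p] for p in subset), Fraction(0))


def rect_measure(A, B):
    """lambda(A x B) = mu(A) * nu(B)."""
    return measure(mu, A) * measure(nu, B)


A, B = {"x1", "x2", "x3"}, {"y1", "y2"}
# A decomposition of A x B into mutually disjoint rectangles A_i x B_i.
pieces = [
    ({"x1"}, {"y1", "y2"}),
    ({"x2", "x3"}, {"y1"}),
    ({"x2", "x3"}, {"y2"}),
]
covered = [(x, y) for Ai, Bi in pieces for x in Ai for y in Bi]
total = sum((rect_measure(Ai, Bi) for Ai, Bi in pieces), Fraction(0))
```

Exact rational arithmetic is used so that the equality can be tested without rounding error.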

The central theorem of this section is the Fubini-Tonelli theorem, which relates
the "double integral" (relative to $\mu \times \nu$) with the "iterated integrals" (relative to
$\mu$ and $\nu$ in either order). We need first some technical lemmas.


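Lemma 2.14. For each $E \in \mathcal{C}_{\sigma\delta}$, the sections $E_x := \{y \in Y;\ [x, y] \in E\}$ $(x \in X)$
belong to $\mathcal{B}$.

Proof. If $E = A \times B \in \mathcal{C}$, then $E_x$ is either $B$ (when $x \in A$) or $\emptyset$ (otherwise),
so clearly it belongs to $\mathcal{B}$. If $E \in \mathcal{C}_\sigma$, then $E = \bigcup_i E_i$ with $E_i \in \mathcal{C}$; hence,

$$E_x = \bigcup_i (E_i)_x \in \mathcal{B}.$$

Similarly, if $E \in \mathcal{C}_{\sigma\delta}$, then $E = \bigcap_i E_i$ with $E_i \in \mathcal{C}_\sigma$, and therefore,

$$E_x = \bigcap_i (E_i)_x \in \mathcal{B}$$

for all $x \in X$.

By the lemma, the function

$$g_E(x) := \nu(E_x) : X \to [0, \infty]$$

is well defined, for each $E \in \mathcal{C}_{\sigma\delta}$.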


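Lemma 2.15. Suppose the measure space $(X, \mathcal{A}, \mu)$ is complete. For each
$E \in \mathcal{C}_{\sigma\delta}$ with $(\mu \times \nu)(E) < \infty$, the function $g_E(x) := \nu(E_x)$ is $\mathcal{A}$-measurable,
and

$$\int_X g_E \, d\mu = (\mu \times \nu)(E). \tag{2}$$

Proof. For an arbitrary $E = A \times B \in \mathcal{C}$,

$$g_E = \nu(B) I_A$$

is clearly $\mathcal{A}$-measurable (since $A \in \mathcal{A}$), and (2) is trivially true.

If $E \in \mathcal{C}_\sigma$ (arbitrary), we may represent it as a disjoint union of $E_i \in \mathcal{C}$
$(i \in \mathbb{N})$, and therefore $g_E = \sum_i g_{E_i}$ is $\mathcal{A}$-measurable, and by the Beppo Levi
theorem and the $\sigma$-additivity of $\mu \times \nu$,

$$\int_X g_E \, d\mu = \sum_i \int_X g_{E_i} \, d\mu = \sum_i (\mu \times \nu)(E_i) = (\mu \times \nu)(E).$$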


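Let now $E \in \mathcal{C}_{\sigma\delta}$ with $(\mu \times \nu)(E) < \infty$. Thus $E = \bigcap_i F_i$ with
$F_i \in \mathcal{C}_\sigma$. By Lemma 2.10, there exists $G \in \mathcal{C}_\sigma$ such that $E \subset G$ and
$(\mu \times \nu)(G) < (\mu \times \nu)(E) + 1 < \infty$. Then

$$E = E \cap G = \bigcap_i (F_i \cap G) = \bigcap_k E_k,$$

where (for $k = 1, 2, \ldots$)

$$E_k := \bigcap_{i=1}^{k} (F_i \cap G).$$

Since $\mathcal{C}_\sigma$ is closed under finite intersections, $E_k \in \mathcal{C}_\sigma$, $E_{k+1} \subset E_k$, and $E_1 \subset G$ has
finite product measure. Therefore $g_{E_k}$ is $\mathcal{A}$-measurable, and

$$\int_X g_{E_k} \, d\mu = (\mu \times \nu)(E_k) < \infty$$

for all $k$. In particular $g_{E_1} < \infty$ $\mu$-a.e.
For $x$ such that $g_{E_1}(x)$ $(= \nu((E_1)_x)) < \infty$, we have by Lemma 1.11:

$$g_E(x) := \nu(E_x) = \nu\Big(\bigcap_k (E_k)_x\Big) = \lim_k \nu((E_k)_x) = \lim_k g_{E_k}(x).$$

Hence $g_{E_k} \to g_E$ $\mu$-a.e. Since the measure space $(X, \mathcal{A}, \mu)$ is complete by
hypothesis, it follows that $g_E$ is $\mathcal{A}$-measurable. Also $0 \le g_{E_k} \le g_{E_1}$ for all $k$, and
$\int_X g_{E_1} \, d\mu < \infty$. Therefore, by Lebesgue's dominated convergence theorem and
Lemma 1.11,

$$\int_X g_E \, d\mu = \lim_k \int_X g_{E_k} \, d\mu = \lim_k (\mu \times \nu)(E_k) = (\mu \times \nu)(E).$$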


We now extend this lemma to all $E \in \mathcal{A} \times \mathcal{B}$ with finite product measure.


Lemma 2.16. Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be complete measure spaces. Let
$E \in \mathcal{A} \times \mathcal{B}$ have finite product measure. Then the sections $E_x$ are $\mathcal{B}$-measurable
for $\mu$-almost all $x$; the ($\mu$-a.e. defined and finite) function $g_E(x) := \nu(E_x)$ is
$\mathcal{A}$-measurable, and

$$\int_X g_E \, d\mu = (\mu \times \nu)(E).$$

Proof. By Proposition 2.11, since $E$ has finite product measure, there exists
$F \in \mathcal{C}_{\sigma\delta}$ such that $E \subset F$ and $(\mu \times \nu)(F) = (\mu \times \nu)(E) < \infty$. Let $G := F - E$.
Then $G \in \mathcal{A} \times \mathcal{B}$ has zero product measure (since $E$ and $F$ have equal finite
product measure). Again by Proposition 2.11, there exists $H \in \mathcal{C}_{\sigma\delta}$ such that
$G \subset H$ and $(\mu \times \nu)(H) = 0$. By Lemma 2.15, $g_H$ is $\mathcal{A}$-measurable and
$\int_X g_H \, d\mu = (\mu \times \nu)(H) = 0$. Therefore, $\nu(H_x) := g_H(x) = 0$ $\mu$-a.e. Since $G_x \subset H_x$, it follows
from the completeness of the measure space $(Y, \mathcal{B}, \nu)$ that, for $\mu$-almost all $x$, $G_x$
is $\mathcal{B}$-measurable and $\nu(G_x) = 0$. Since $E = F - G$, it follows that for $\mu$-almost
all $x$, $E_x$ is $\mathcal{B}$-measurable and $\nu(E_x) = \nu(F_x)$, that is, $g_E = g_F$ ($\mu$-a.e.) is
$\mathcal{A}$-measurable (by Lemma 2.15), and

$$\int_X g_E \, d\mu = \int_X g_F \, d\mu = (\mu \times \nu)(F) = (\mu \times \nu)(E).$$
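The identity $\int_X g_E \, d\mu = (\mu \times \nu)(E)$ is easy to see in a finite model. In the sketch below (a toy illustration, with hypothetical point weights on three-point spaces), $E$ is a non-rectangular "triangle", its sections $E_x$ are computed explicitly, and the integral of $g_E(x) = \nu(E_x)$ is compared with the product measure of $E$.

```python
from fractions import Fraction

# Hypothetical point weights on X = Y = {1, 2, 3}.
mu = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
nu = {1: Fraction(2), 2: Fraction(3), 3: Fraction(5)}

# A non-rectangular subset of X x Y: the "triangle" {(x, y); y <= x}.
E = {(x, y) for x in mu for y in nu if y <= x}


def section(E, x):
    """The section E_x = {y; (x, y) in E}."""
    return {y for (x0, y) in E if x0 == x}


def g(E, x):
    """g_E(x) = nu(E_x)."""
    return sum((nu[y] for y in section(E, x)), Fraction(0))


# Integral of g_E over X with respect to mu.
integral_of_g = sum((g(E, x) * mu[x] for x in mu), Fraction(0))
# Product measure of E, summed point by point.
product_measure = sum((mu[x] * nu[y] for (x, y) in E), Fraction(0))
```

In this finite setting every set is measurable, so the example only illustrates the bookkeeping of sections, not the measurability questions that the lemmas address.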


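Note that for any $E \subset X \times Y$ and $x \in X$,

$$I_{E_x} = I_E(x, \cdot).$$

Therefore, if $E \in \mathcal{A} \times \mathcal{B}$ has finite product measure, then for $\mu$-almost all $x$,
the function $I_E(x, \cdot)$ is $\mathcal{B}$-measurable, with integral (over $Y$) equal to
$\nu(E_x) := g_E(x) < \infty$, that is, for $\mu$-almost all $x$,

$$I_E(x, \cdot) \in L^1(\nu), \tag{i}$$

and its integral (= $g_E$) is $\mathcal{A}$-measurable, with integral (over $X$) equal to
$(\mu \times \nu)(E) < \infty$, that is,

$$\int_Y I_E(x, \cdot) \, d\nu \in L^1(\mu), \tag{ii}$$

and

$$\int_X \Big[ \int_Y I_E(x, \cdot) \, d\nu \Big] d\mu = \int_X g_E \, d\mu = (\mu \times \nu)(E) = \int_{X \times Y} I_E \, d(\mu \times \nu). \tag{iii}$$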


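If $f$ is a simple non-negative function in $L^1(\mu \times \nu)$, we may write $f = \sum_k c_k I_{E_k}$
(finite sum), with $c_k > 0$ and $(\mu \times \nu)(E_k) < \infty$. Then for $\mu$-almost all $x$, $f(x, \cdot)$ is
a linear combination of $L^1(\nu)$-functions (by (i)), and hence belongs to $L^1(\nu)$; its
integral (over $Y$) is a linear combination of the $g_{E_k} \in L^1(\mu)$, and hence belongs
to $L^1(\mu)$, and by (iii),

$$\int_X \Big[ \int_Y f(x, \cdot) \, d\nu \Big] d\mu = \sum_k c_k \int_{X \times Y} I_{E_k} \, d(\mu \times \nu) = \int_{X \times Y} f \, d(\mu \times \nu).$$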


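If $f \in L^1(\mu \times \nu)$ is non-negative, by Theorem 1.8, we get simple measurable
functions

$$0 \le s_1 \le s_2 \le \cdots \le f$$

such that $\lim s_n = f$. Necessarily, $s_n \in L^1(\mu \times \nu)$, so by the preceding conclusions,
for $\mu$-almost all $x$, $s_n(x, \cdot)$ are $\mathcal{B}$-measurable, and their integrals (over $Y$) are
$\mathcal{A}$-measurable; therefore, for $\mu$-almost all $x$, $f(x, \cdot)$ is $\mathcal{B}$-measurable, and by the
monotone convergence theorem,

$$\int_Y f(x, \cdot) \, d\nu = \lim_n \int_Y s_n(x, \cdot) \, d\nu,$$

so that the integrals on the left are $\mathcal{A}$-measurable. Applying the monotone
convergence theorem to the sequence on the right, we have by (iii) for $s_n$,

$$\int_X \Big[ \int_Y f(x, \cdot) \, d\nu \Big] d\mu = \lim_n \int_X \Big[ \int_Y s_n(x, \cdot) \, d\nu \Big] d\mu = \lim_n \int_{X \times Y} s_n \, d(\mu \times \nu) = \int_{X \times Y} f \, d(\mu \times \nu)$$

(by another application of the monotone convergence theorem). In particular,
$\int_Y f(x, \cdot) \, d\nu \in L^1(\mu)$ and therefore, $f(x, \cdot) \in L^1(\nu)$ for $\mu$-almost all $x$.

For $f \in L^1(\mu \times \nu)$ complex, decompose $f = u^+ - u^- + iv^+ - iv^-$ to obtain the
conclusions (i)-(iii) for $f$ instead of $I_E$. Finally, we may interchange the roles of
$x$ and $y$. Collecting, we proved the following.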


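Theorem 2.17 (Fubini's theorem). Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be complete
(positive) measure spaces, and let $f \in L^1(\mu \times \nu)$. Then

(i) for $\mu$-almost all $x$, $f(x, \cdot) \in L^1(\nu)$, and for $\nu$-almost all $y$, $f(\cdot, y) \in L^1(\mu)$;
(ii) $\int_Y f(x, \cdot) \, d\nu \in L^1(\mu)$ and $\int_X f(\cdot, y) \, d\mu \in L^1(\nu)$;
(iii) $\int_X \big[ \int_Y f(x, \cdot) \, d\nu \big] d\mu = \int_{X \times Y} f \, d(\mu \times \nu) = \int_Y \big[ \int_X f(\cdot, y) \, d\mu \big] d\nu$.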
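On finite sets with point weights, the three integrals in (iii) reduce to finite sums, and their equality can be checked mechanically. The Python sketch below is such a check under hypothetical weights and a hypothetical sample function taking both signs; it illustrates the statement and says nothing about the measure-theoretic subtleties of the general case.

```python
from fractions import Fraction

# Hypothetical point weights defining mu on X = {0, 1, 2} and nu on Y = {0, 1}.
mu = {0: Fraction(1), 1: Fraction(1, 2), 2: Fraction(1, 3)}
nu = {0: Fraction(2), 1: Fraction(1, 5)}


def f(x, y):
    # A sample integrand taking both positive and negative values.
    return x * x - 3 * y


# Integrate over Y first, then over X.
x_then_y = sum(
    (mu[x] * sum((nu[y] * f(x, y) for y in nu), Fraction(0)) for x in mu),
    Fraction(0),
)
# Integrate over X first, then over Y.
y_then_x = sum(
    (nu[y] * sum((mu[x] * f(x, y) for x in mu), Fraction(0)) for y in nu),
    Fraction(0),
)
# "Double integral" with respect to the product weights.
double = sum((mu[x] * nu[y] * f(x, y) for x in mu for y in nu), Fraction(0))
```

All three quantities are computed with exact fractions, so they can be compared with `==`.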


When we need to verify the hypothesis $f \in L^1(\mu \times \nu)$ (i.e., the finiteness of
the integral $\int_{X \times Y} |f| \, d(\mu \times \nu)$), the following theorem on non-negative functions
is useful.

Theorem 2.18 (Tonelli's theorem). Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be complete
$\sigma$-finite measure spaces, and let $f \ge 0$ be $\mathcal{A} \times \mathcal{B}$-measurable. Then (i) and (ii) in
Fubini's theorem (2.17) are valid with the relation "$\in L^1(\cdots)$" replaced by the
expression "is measurable", and (iii) is valid.

Proof. The integrability of $f \ge 0$ was used in the preceding proof to guarantee
that the measurable simple functions $s_n$ be in $L^1(\mu \times \nu)$, that is, that they vanish
outside a measurable set of finite (product) measure, so that the preceding step,
based on Lemma 2.16, could be applied. In our case, the product measure space
$Z = X \times Y$ is $\sigma$-finite. Write $Z = \bigcup_n Z_n$ with $Z_n \in \mathcal{A} \times \mathcal{B}$ of finite product
measure and $Z_n \subset Z_{n+1}$. With $s_n$ as before, the "corrected" simple functions
$s'_n := s_n I_{Z_n}$ meet the said requirements.


Exercises 


1. Calculate (with appropriate justification): 
(a) limp co fe(e~” /”)/(1 + 2?) de. 
(b) limy soy [7/? sin[(w/2)e~*”| cos x de. 
(c) fo Sov lyarctan(ay)]/[(1 + x?y?)(1 + y?)] dy de. 
2. Let L'(R) be the Lebesgue space with respect to the Lebesgue measure on 
R. If f € L1(R), define 


F(t) = f FE p05) as (u>0, tER). 


Prove: 


(a) For each u > 0, the function F,, : R > C is well defined, continuous, 
and bounded by ||f|[1. 


b) limy+oo Fy = 0 and limy,+04 Fy = s) ds pointwise. 
rf 


3. Let h : [0,00) > [0, 00) have a non-negative continuous derivative, h(0) = 0, 
and h(oo) = oo. Prove that 


a [7m (t)?) dtds = Vn/2. 




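4. Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be complete $\sigma$-finite positive measure spaces,
and $p \in [1, \infty)$. Consider the map

$$[f, g] \in L^p(\mu) \times L^p(\nu) \to F(x, y) := f(x)g(y).$$

Prove:

(a) $F \in L^p(\mu \times \nu)$ and $\|F\|_{L^p(\mu \times \nu)} = \|f\|_{L^p(\mu)} \|g\|_{L^p(\nu)}$.
(b) The map $[f, g] \to F$ is continuous from $L^p(\mu) \times L^p(\nu)$ to $L^p(\mu \times \nu)$.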


5. Let $f : \mathbb{R}^2 \to \mathbb{C}$ be Lebesgue measurable, such that
$|f(x, y)| \le M e^{-x^2} I_{[-|x|, |x|]}(y)$ on $\mathbb{R}^2$, for some constant $M > 0$. Prove:

(a) $f \in L^p(\mathbb{R}^2)$ for all $p \in [1, \infty)$, and $\|f\|_{L^p(\mathbb{R}^2)} \le M(2/p)^{1/p}$.

(b) Suppose $h : \mathbb{R} \to \mathbb{C}$ is continuous and vanishes outside the interval
$[-1, 1]$. Define $f : \mathbb{R}^2 \to \mathbb{C}$ by $f(x, y) = e^{-x^2} h(y/x)$ for $x \ne 0$ and
$f(0, y) = 0$. Then $\int_{\mathbb{R}^2} f \, dx \, dy = \int_{-1}^{1} h(t) \, dt$.

6. Let $f : \mathbb{R}^2 \to \mathbb{R}$. Prove:

(a) If $f(x, \cdot)$ is Borel for all real $x$ and $f(\cdot, y)$ is continuous for all real $y$,
then $f$ is Borel on $\mathbb{R}^2$.

(b) If $f(x, \cdot)$ is Lebesgue measurable for all $x$ in some dense set $E \subset \mathbb{R}$ and
$f(\cdot, y)$ is continuous for almost all $y \in \mathbb{R}$, then $f$ is Lebesgue measurable
on $\mathbb{R}^2$.


Convolution and Fourier transform 


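7. If $E \subset \mathbb{R}$, denote

$$\tilde{E} := \{(x, y) \in \mathbb{R}^2;\ x - y \in E\}$$

and

$$\mathcal{S} := \{E \subset \mathbb{R};\ \tilde{E} \in \mathcal{B}(\mathbb{R}^2)\},$$

where $\mathcal{B}(\mathbb{R}^2)$ is the Borel $\sigma$-algebra on $\mathbb{R}^2$. Prove:

(a) $\mathcal{S}$ is a $\sigma$-algebra on $\mathbb{R}$ which contains the open sets (hence $\mathcal{B}(\mathbb{R}) \subset \mathcal{S}$).
(b) If $f$ is a Borel function on $\mathbb{R}$, then $f(x - y)$ is a Borel function on $\mathbb{R}^2$.
(c) If $f, g$ are integrable Borel functions on $\mathbb{R}$, then $f(x - y)g(y)$ is an
integrable Borel function on $\mathbb{R}^2$ and its $L^1(\mathbb{R}^2)$-norm is equal to the
product of the $L^1(\mathbb{R})$ norms of $f$ and $g$.

(d) Let $L^1(\mathbb{R})$ and $L^1(\mathbb{R}^2)$ be the Lebesgue spaces for the Lebesgue measure
spaces on $\mathbb{R}$ and $\mathbb{R}^2$ respectively. If $f, g \in L^1(\mathbb{R})$, then $f(x - y)g(y) \in L^1(\mathbb{R}^2)$,

$$\|f(x - y)g(y)\|_{L^1(\mathbb{R}^2)} = \|f\|_1 \|g\|_1,$$

and

$$\int_{\mathbb{R}} |f(x - y)g(y)| \, dy < \infty \tag{1}$$

for almost all $x$.

(e) For $x$ such that (1) holds, define

$$(f * g)(x) = \int_{\mathbb{R}} f(x - y)g(y) \, dy. \tag{2}$$

Show that the (almost everywhere defined and finite-valued) function
$f * g$ (called the convolution of $f$ and $g$) is in $L^1(\mathbb{R})$, and

$$\|f * g\|_1 \le \|f\|_1 \|g\|_1. \tag{3}$$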
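(f) For $f \in L^1(\mathbb{R})$, define its Fourier transform $Ff$ by

$$(Ff)(t) = \int_{\mathbb{R}} f(x) e^{-ixt} \, dx \quad (t \in \mathbb{R}). \tag{4}$$

Show that $Ff : \mathbb{R} \to \mathbb{C}$ is continuous, bounded by $\|f\|_1$, and
$F(f * g) = (Ff)(Fg)$ for all $f, g \in L^1(\mathbb{R})$.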


(g) If $f = I_{(a, b]}$ for $-\infty < a < b < \infty$, then

$$\lim_{|t| \to \infty} (Ff)(t) = 0. \tag{5}$$

(h) Show that the step functions (i.e., finite linear combinations of
indicators of disjoint intervals $(a_k, b_k]$) are dense in $C_c(\mathbb{R})$ (the normed
space of continuous complex functions on $\mathbb{R}$ with compact support, with
pointwise operations and supremum norm), and hence also in $L^p(\mathbb{R})$ for
any $1 \le p < \infty$.

(i) Prove (5) for any $f \in L^1(\mathbb{R})$. (This is the Riemann-Lebesgue lemma.)

(j) Generalize the previous statements to functions on $\mathbb{R}^k$.


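8. Let $p \in [1, \infty)$ and let $q$ be its conjugate exponent. Let $K : \mathbb{R}^2 \to \mathbb{C}$ be
Lebesgue measurable such that

$$\tilde{K}(y) := \int_{\mathbb{R}} |K(x, y)| \, dx \in L^q(\mathbb{R}).$$

Denote

$$(Tf)(x) = \int_{\mathbb{R}} K(x, y) f(y) \, dy.$$

Prove:

(a) $\int_{\mathbb{R}} \int_{\mathbb{R}} |K(x, y) f(y)| \, dy \, dx \le \|\tilde{K}\|_q \|f\|_p$ for all $f \in L^p(\mathbb{R})$. Conclude that
$K(x, \cdot) f \in L^1(\mathbb{R})$ for almost all $x$, and therefore $Tf$ is well defined a.e.
(when $f \in L^p$).

(b) $T$ is a continuous (linear) map of $L^p(\mathbb{R})$ into $L^1(\mathbb{R})$, and
$\|Tf\|_1 \le \|\tilde{K}\|_q \|f\|_p$.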


9. Apply Fubini's theorem to the function $e^{-xy} \sin x$ in order to prove the
(Dirichlet) formula

$$\int_0^{\infty} \frac{\sin x}{x} \, dx = \pi/2.$$


3 


Measure and topology 


In this chapter, the space $X$ will be a topological space, and we are interested
in constructing a measure space $(X, \mathcal{M}, \mu)$ with a "natural" affinity to the given
topology.

Denote by $C_c(X)$ the vector space of all complex-valued continuous functions
with compact support on a locally compact Hausdorff space $X$. A linear
functional $\phi$ on $C_c(X)$ is called positive if $\phi(f) \ge 0$ when $f \ge 0$. Measures provide
a way to construct such functionals: indeed, if $\mu$ is a positive Borel measure on
$X$ that is finite on compact subsets of $X$, then the formula $\phi(f) := \int_X f \, d\mu$
defines a positive functional on $C_c(X)$. The central result of this chapter is the
Riesz-Markov theorem, essentially saying that all positive functionals on $C_c(X)$
arise from positive measures this way. This result has a pivotal role in measure
theory and functional analysis. In particular, it is the main ingredient in the
proof of the Riesz representation theorem that we prove in Chapter 4. A simple
application of the Riesz-Markov theorem is an alternative construction of the
Lebesgue measure on $\mathbb{R}^k$.

We next prove a few results, including Lusin's theorem, concerning
approximating various types of functions by elements of $C_c(X)$. A short section
on the support of a measure follows.

The final section introduces differentiability of complex measures on $\mathbb{R}^k$, and
proves that every such measure $\mu$ is differentiable $m$-a.e. where $m$ is the Lebesgue
measure on $\mathbb{R}^k$, and moreover, the resulting "derivative" function $D\mu$ is the
Radon-Nikodym derivative of the absolutely continuous part of $\mu$ with respect
to $m$. With some additional work this implies the (two parts of the) Fundamental
Theorem of Calculus; the details are left to the reader in Exercise 4.


3.1 Partition of unity 


We recall first some basic topological concepts. 


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press.
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0003




A Hausdorff space (or $T_2$-space) is a topological space $(X, \tau)$ in which distinct
points have disjoint open neighborhoods. A Hausdorff space $X$ is locally compact
if each point in $X$ has a compact neighborhood. A Hausdorff space $X$ can
be imbedded (homeomorphically) as a dense subspace of a compact space $Y$
(the Alexandroff one-point compactification of $X$), and $Y$ is Hausdorff iff $X$ is
locally compact. In that case, $Y$ is normal (as a compact Hausdorff space), and
Urysohn's lemma is valid in $Y$, that is, given disjoint closed sets $A, B$ in $Y$,
there exists a continuous function $h : Y \to [0, 1]$ such that $h(A) = \{0\}$ and
$h(B) = \{1\}$. Theorem 3.1 "translates" this result to $X$. We need the following
important concept: for any complex continuous function $f$ on $X$, the support of
$f$ (denoted $\mathrm{supp}\, f$) is defined as the closure of $[f^{-1}(\{0\})]^c$.


Theorem 3.1 (Urysohn's lemma for locally compact Hausdorff space).
Let $X$ be a locally compact Hausdorff space, let $U \subset X$ be open, and let $K \subset U$
be compact.

Then there exists a continuous function $f : X \to [0, 1]$ with compact support
such that $\mathrm{supp}\, f \subset U$ and $f(K) = \{1\}$.


Proof. Let $Y$ be the Alexandroff one-point compactification of $X$. The set $U$
is open in $X$, hence in $Y$. The set $K$ is compact in $X$, hence in $Y$, and is
therefore closed in $Y$ (since $Y$ is Hausdorff). Since $Y$ is normal, and the closed
set $K$ is contained in the open set $U$, there exists an open set $V$ in $Y$ such
that $K \subset V$ and $\mathrm{cl}_Y(V) \subset U$ (where $\mathrm{cl}_Y$ denotes the closure operator in $Y$).
Therefore, $K \subset V \cap X := W$, $W$ is open in $X$, and $\mathrm{cl}_X(W) = \mathrm{cl}_Y(V) \cap X \subset U$.

In the sequel, all closures are closures in $X$.

Since $X$ is locally compact, each $x \in K$ has an open neighborhood $N_x$ with
compact closure. Then $N_x \cap W$ is an open neighborhood of $x$, with closure
contained in $\mathrm{cl}(N_x) \cap \mathrm{cl}(W)$, which is compact (since $\mathrm{cl}(N_x)$ is compact), and is
contained in $U$. By compactness of $K$, we obtain finitely many points $x_i \in K$
such that

$$K \subset \bigcup_{i=1}^{p} (N_{x_i} \cap W) := N.$$

The open set $N$ has closure equal to the union of the compact sets $\mathrm{cl}(N_{x_i} \cap W)$,
which is compact and contained in $U$. We proved that whenever $K$ is a compact
subset of the open set $U$ in $X$, there exists an open set $W$ with compact closure
such that

$$K \subset W \subset \mathrm{cl}(W) \subset U. \tag{1}$$

(We wrote $W$ instead of $N$.)

The sets $K$ and $Y - W$ are disjoint closed sets in $Y$. By Urysohn's lemma
for normal spaces, there exists a continuous function $h : Y \to [0, 1]$ such that
$h = 0$ on $Y - W$ and $h(K) = \{1\}$. Let $f := h|_X$ (the restriction of $h$ to $X$). Then
$f : X \to [0, 1]$ is continuous, $f(K) = \{1\}$, and since $[f \ne 0] \subset W$, we have by (1)

$$\mathrm{supp}\, f \subset \mathrm{cl}(W) \subset U.$$

In particular, $f$ has compact support (since $\mathrm{supp}\, f$ is a closed subset of the
compact set $\mathrm{cl}(W)$).




Notation 3.2. We denote the space of all complex (real) continuous functions
with compact support on the locally compact Hausdorff space $X$ by $C_c(X)$
($C_c^R(X)$, respectively). By Theorem 3.1, this is a non-trivial normed vector
space over $\mathbb{C}$ (over $\mathbb{R}$, respectively), with the uniform norm

$$\|f\|_u := \sup_X |f|.$$

The positive cone

$$C_c^+(X) := \{f \in C_c^R(X);\ f \ge 0\}$$

will play a central role in this section.

Actually, Theorem 3.1 asserts that for any open set $U \ne \emptyset$, the set

$$\Omega(U) := \{f \in C_c^+(X);\ f \le 1,\ \mathrm{supp}\, f \subset U\} \tag{2}$$

is $\ne \{0\}$.


The following theorem generalizes Theorem 3.1 to the case of a finite open
cover of the compact set $K$. Any set of functions $\{h_1, \ldots, h_n\}$ with the properties
described in Theorem 3.3 is called a partition of unity in $C_c(X)$ subordinate to
the open cover $\{V_1, \ldots, V_n\}$ of the compact set $K$.

Theorem 3.3. Let $X$ be a locally compact Hausdorff space, let $K \subset X$ be
compact, and let $V_1, \ldots, V_n$ be open subsets of $X$ such that

$$K \subset V_1 \cup \cdots \cup V_n.$$

Then there exist $h_i \in \Omega(V_i)$, $i = 1, \ldots, n$, such that $h_1 + \cdots + h_n \in \Omega(V_1 \cup \cdots \cup V_n)$
and

$$1 = h_1 + \cdots + h_n \quad \text{on } K.$$


Proof. For each $x \in K$, there exists an index $i(x)$ (between $1$ and $n$) such that
$x \in V_{i(x)}$. By (1) (applied to the compact set $\{x\}$ contained in the open set
$V_{i(x)}$), there exists an open set $W_x$ with compact closure such that

$$x \in W_x \subset \mathrm{cl}(W_x) \subset V_{i(x)}. \tag{3}$$

By the compactness of $K$, there exist $x_1, \ldots, x_m \in K$ such that

$$K \subset W_{x_1} \cup \cdots \cup W_{x_m}.$$

Define for each $i = 1, \ldots, n$

$$H_i := \bigcup \{\mathrm{cl}(W_{x_j});\ \mathrm{cl}(W_{x_j}) \subset V_i\}.$$

As a finite union of compact sets, $H_i$ is compact, and contained in $V_i$. By
Theorem 3.1, there exist $f_i \in \Omega(V_i)$ such that $f_i = 1$ on $H_i$. Take $h_1 = f_1$,
and for $k = 2, \ldots, n$, consider the continuous functions

$$h_k = f_k \prod_{i=1}^{k-1} (1 - f_i).$$

An immediate induction on $k$ (up to $n$) shows that

$$h_1 + \cdots + h_k = 1 - \prod_{i=1}^{k} (1 - f_i).$$

Since $f_i = 1$ on $H_i$, the product $\prod_{i=1}^{n} (1 - f_i)$ vanishes on $\bigcup_{i=1}^{n} H_i$, and this
union contains the union of the $W_{x_j}$, $j = 1, \ldots, m$, hence contains $K$. Therefore,
$h_1 + \cdots + h_n = 1$ on $K$. The support of $h_k$ is contained in the support of $f_k$,
hence in $V_k$, and $0 \le h_k \le 1$ trivially. The support of $h_1 + \cdots + h_n$ is thus
contained in $V_1 \cup \cdots \cup V_n$. Moreover, since $h_1 + \cdots + h_n = 1 - \prod_{i=1}^{n} (1 - f_i)$ and
$0 \le f_i \le 1$ for all $i$, we also have $0 \le h_1 + \cdots + h_n \le 1$.
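The construction $h_1 = f_1$, $h_k = f_k \prod_{i<k} (1 - f_i)$ can be tried out numerically on the real line. The sketch below is an illustration under hypothetical choices: $K = [0, 2]$, the cover $V_1 = (-1, 1.5)$, $V_2 = (0.5, 3)$, and piecewise linear "trapezoid" functions standing in for the $f_i$ supplied by Theorem 3.1.

```python
def trapezoid(left, lo, hi, right):
    """Continuous function equal to 1 on [lo, hi], 0 outside (left, right),
    and linear in between; its support is [left, right]."""
    def f(x):
        if x <= left or x >= right:
            return 0.0
        if x < lo:
            return (x - left) / (lo - left)
        if x > hi:
            return (right - x) / (right - hi)
        return 1.0
    return f


def partition_of_unity(fs):
    """Given f_1, ..., f_n, return h_1, ..., h_n with
    h_1 = f_1 and h_k = f_k * prod_{i<k} (1 - f_i)."""
    def make(k):
        def h(x):
            value = fs[k](x)
            for i in range(k):
                value *= 1.0 - fs[i](x)
            return value
        return h
    return [make(k) for k in range(len(fs))]


# f1 has support in V_1 = (-1, 1.5) and equals 1 on H_1 = [-0.5, 1];
# f2 has support in V_2 = (0.5, 3) and equals 1 on H_2 = [1, 2.5].
f1 = trapezoid(-0.75, -0.5, 1.0, 1.25)
f2 = trapezoid(0.75, 1.0, 2.5, 2.75)
h1, h2 = partition_of_unity([f1, f2])

grid_K = [2.0 * j / 200 for j in range(201)]          # sample points of K
grid_all = [-1.0 + 4.0 * j / 400 for j in range(401)]  # sample points of [-1, 3]
```

Since $H_1 \cup H_2 \supset K$, the sum $h_1 + h_2$ equals $1$ at every sample point of $K$ (up to floating-point rounding) and stays between $0$ and $1$ elsewhere.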


3.2 Positive linear functionals 


Definition 3.4. A linear functional $\phi : C_c(X) \to \mathbb{C}$ is said to be positive if
$\phi(f) \ge 0$ for all $f \in C_c^+(X)$.

This is clearly equivalent to the monotonicity condition: $\phi(f) \ge \phi(g)$
whenever $f \ge g$ ($f, g \in C_c^R(X)$).

Let $V \in \tau$. The indicator $I_V$ is continuous if and only if $V$ is also closed.
In that case, $I_V \in \Omega(V)$, and $f \le I_V$ for all $f \in \Omega(V)$. By monotonicity of $\phi$,
$0 \le \phi(f) \le \phi(I_V)$ for all $f \in \Omega(V)$, and therefore $0 \le \sup_{f \in \Omega(V)} \phi(f) \le \phi(I_V)$.
Since $I_V \in \Omega(V)$, we actually have an identity (in our special case). The set
function $\phi(I_V)$ is a "natural" candidate for a measure of $V$ (associated to the
given functional). Since the supremum expression makes sense for arbitrary open
sets, we take it as the definition of our "measure" (first defined on $\tau$).


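Definition 3.5. Let $(X, \tau)$ be a locally compact Hausdorff space, and let $\phi$ be
a positive linear functional on $C_c(X)$. Set

$$\mu(V) := \sup_{f \in \Omega(V)} \phi(f) \quad (V \in \tau).$$

In particular, $\phi(f) \le \mu(V)$ whenever $f \in \Omega(V)$.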


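Lemma 3.6. $\mu$ is non-negative, monotonic, and subadditive (on $\tau$), and
$\mu(\emptyset) = 0$.

Proof. For each $f \in \Omega(V) \subset C_c^+(X)$, $\phi(f) \ge 0$, and therefore $\mu(V) \ge 0$ (for all
$V \in \tau$). Since $\Omega(\emptyset) = \{0\}$ and $\phi(0) = 0$, we trivially have $\mu(\emptyset) = 0$. If $V \subset W$
(with $V, W \in \tau$), then $\Omega(V) \subset \Omega(W)$, so that $\mu(V) \le \mu(W)$.

Next, let $V_i \in \tau$, $i = 1, \ldots, n$, with union $V$. Fix $f \in \Omega(V)$. Let then
$h_1, \ldots, h_n$ be a partition of unity in $C_c(X)$ subordinate to the open covering
$V_1, \ldots, V_n$ of the compact set $\mathrm{supp}\, f$. Then

$$f = \sum_{i=1}^{n} h_i f; \qquad h_i f \in \Omega(V_i), \quad i = 1, \ldots, n.$$

Therefore

$$\phi(f) = \sum_{i=1}^{n} \phi(h_i f) \le \sum_{i=1}^{n} \mu(V_i).$$

Taking the supremum over all $f \in \Omega(V)$, we obtain $\mu(V) \le \sum_{i=1}^{n} \mu(V_i)$.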


We now extend the definition of $\mu$ to $\mathcal{P}(X)$.

Definition 3.7. For any $E \in \mathcal{P}(X)$, we set

$$\mu^*(E) = \inf_{E \subset V \in \tau} \mu(V).$$

If $E \in \tau$, then $E \subset E \in \tau$, so that $\mu^*(E) \le \mu(E)$. On the other hand,
whenever $E \subset V \in \tau$, $\mu(E) \le \mu(V)$ (by Lemma 3.6), so that $\mu(E) \le \mu^*(E)$.
Thus $\mu^* = \mu$ on $\tau$, and $\mu^*$ is indeed an extension of $\mu$.


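Lemma 3.8. $\mu^*$ is an outer measure.

Proof. First, since $\emptyset \in \tau$, $\mu^*(\emptyset) = \mu(\emptyset) = 0$.
If $E \subset F \subset X$, then $E \subset V \in \tau$ whenever $F \subset V \in \tau$, and therefore

$$\mu^*(E) := \inf_{E \subset V \in \tau} \mu(V) \le \inf_{F \subset V \in \tau} \mu(V) := \mu^*(F),$$

proving the monotonicity of $\mu^*$.

Let $\{E_i\}$ be any sequence of subsets of $X$. If $\mu^*(E_i) = \infty$ for some $i$, then
$\sum_i \mu^*(E_i) = \infty \ge \mu^*(\bigcup_i E_i)$ trivially. Assume therefore that $\mu^*(E_i) < \infty$ for
all $i$. Let $\epsilon > 0$. By Definition 3.7, there exist open sets $V_i$ such that

$$E_i \subset V_i; \qquad \mu(V_i) < \mu^*(E_i) + \epsilon/2^i, \quad i = 1, 2, \ldots.$$

Let $E$ and $V$ be the unions of the sets $E_i$ and $V_i$, respectively. If $f \in \Omega(V)$,
then $\{V_i\}$ is an open cover of the compact set $\mathrm{supp}\, f$, and there exists therefore
$n \in \mathbb{N}$ such that $\mathrm{supp}\, f \subset V_1 \cup \cdots \cup V_n$, that is, $f \in \Omega(V_1 \cup \cdots \cup V_n)$. Hence, by
Lemma 3.6,

$$\phi(f) \le \mu(V_1 \cup \cdots \cup V_n) \le \mu(V_1) + \cdots + \mu(V_n) < \sum_i \mu^*(E_i) + \epsilon.$$

Taking the supremum over all $f \in \Omega(V)$, we get

$$\mu(V) \le \sum_i \mu^*(E_i) + \epsilon.$$

Since $E \subset V \in \tau$, we get by definition

$$\mu^*(E) \le \mu(V) \le \sum_i \mu^*(E_i) + \epsilon,$$

and the $\sigma$-subadditivity of $\mu^*$ follows from the arbitrariness of $\epsilon$.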


At this stage, we could appeal to Caratheodory’s theory (cf. Chapter 2) to 
obtain the wanted measure space. We prefer, however, to give a construction 
independent of Chapter 2, and strongly linked to the topology of the space. 

Denote by $\mathcal{K}$ the family of all compact subsets of $X$.


Lemma 3.9. $\mu^*$ is finite and additive on $\mathcal{K}$.

Proof. Let $K \in \mathcal{K}$. By Theorem 3.1, there exists $f \in C_c(X)$ such that $0 \le f \le 1$ and $f = 1$ on $K$. Let $V := [f > 1/2]$. Then $K \subset V \in \tau$, so that $\mu^*(K) \le \mu(V)$. On the other hand, for all $h \in \Omega(V)$, $h \le 1 < 2f$ on $V$, so that $\mu(V) := \sup_{h \in \Omega(V)} \phi(h) \le \phi(2f)$ (by monotonicity of $\phi$), and therefore $\mu^*(K) \le \phi(2f) < \infty$.

Next, let $K_i \in \mathcal{K}$ ($i = 1, 2$) be disjoint, and let $\epsilon > 0$. Then $K_1 \subset K_2^c \in \tau$. By (1) in the proof of Theorem 3.1, there exists $V_1$ open with compact closure such that $K_1 \subset V_1$ and $\operatorname{cl}(V_1) \subset K_2^c$. Hence $K_2 \subset [\operatorname{cl}(V_1)]^c := V_2 \in \tau$, and $V_2 \subset V_1^c$. Thus, $V_i$ are disjoint open sets containing $K_i$ (respectively).

Since $K_1 \cup K_2$ is compact, $\mu^*(K_1 \cup K_2) < \infty$, and therefore, by definition of $\mu^*$, there exists an open set $W$ such that $K_1 \cup K_2 \subset W$ and
\[ \mu(W) < \mu^*(K_1 \cup K_2) + \epsilon. \tag{1} \]
By definition of $\mu$ on open sets, there exist $f_i \in \Omega(W \cap V_i)$ such that
\[ \mu(W \cap V_i) < \phi(f_i) + \epsilon, \qquad i = 1, 2. \tag{2} \]
Since $K_i \subset W \cap V_i$, $i = 1, 2$, it follows from (2) that
\[ \mu^*(K_1) + \mu^*(K_2) \le \mu(W \cap V_1) + \mu(W \cap V_2) < \phi(f_1) + \phi(f_2) + 2\epsilon = \phi(f_1 + f_2) + 2\epsilon. \]
However, $f_1 + f_2 \in \Omega(W)$. Hence, by (1),
\[ \mu^*(K_1) + \mu^*(K_2) < \mu(W) + 2\epsilon < \mu^*(K_1 \cup K_2) + 3\epsilon. \]
The arbitrariness of $\epsilon$ and the subadditivity of $\mu^*$ give $\mu^*(K_1 \cup K_2) = \mu^*(K_1) + \mu^*(K_2)$.


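Definition 3.10. The inner measure of $E \in \mathcal{P}(X)$ is defined by
\[ \mu_*(E) := \sup_{\{K \in \mathcal{K};\, K \subset E\}} \mu^*(K). \]
By monotonicity of $\mu^*$, we have $\mu_* \le \mu^*$, and equality (and finiteness) is valid on $\mathcal{K}$ (cf. Lemma 3.9). We consider then the family
\[ \mathcal{M}_0 := \{E \in \mathcal{P}(X);\ \mu_*(E) = \mu^*(E) < \infty\}. \tag{3} \]
We just observed that
\[ \mathcal{K} \subset \mathcal{M}_0. \tag{4} \]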




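Another important subfamily of $\mathcal{M}_0$ consists of the open sets of finite measure $\mu$.

Lemma 3.11. $\tau_0 := \{V \in \tau;\ \mu(V) < \infty\} \subset \mathcal{M}_0$.

Proof. Let $V \in \tau_0$ and $\epsilon > 0$. Since $\mu(V) < \infty$, it follows from the definition of $\mu$ that there exists $f \in \Omega(V)$ such that $\mu(V) - \epsilon < \phi(f)$. Let $K := \operatorname{supp} f$. Whenever $K \subset W \in \tau$, we have necessarily $f \in \Omega(W)$, and therefore $\phi(f) \le \mu(W)$. Hence
\[ \mu(V) - \epsilon < \phi(f) \le \inf_{K \subset W \in \tau} \mu(W) = \mu^*(K) \le \mu_*(V) \le \mu^*(V) = \mu(V), \]
and so $\mu_*(V) = \mu(V)$ $(= \mu^*(V))$ by the arbitrariness of $\epsilon$.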


Lemma 3.12. $\mu^*$ is $\sigma$-additive on $\mathcal{M}_0$, that is, for any sequence of mutually disjoint sets $E_i \in \mathcal{M}_0$ with union $E$,
\[ \mu^*(E) = \sum_i \mu^*(E_i). \tag{5} \]
Furthermore, $E \in \mathcal{M}_0$ if $\mu^*(E) < \infty$. In particular, $\mathcal{M}_0$ is closed under finite disjoint unions.

Proof. By Lemma 3.8, it suffices to prove the inequality $\ge$ in (5). Since this inequality is trivial when $\mu^*(E) = \infty$, we may assume that $\mu^*(E) < \infty$.
Let $\epsilon > 0$ be given. For all $i = 1, 2, \dots$, since $E_i \in \mathcal{M}_0$, there exist $H_i \in \mathcal{K}$ such that $H_i \subset E_i$ and
\[ \mu^*(E_i) < \mu^*(H_i) + \epsilon/2^i. \tag{6} \]
The sets $H_i$ are necessarily mutually disjoint. Define
\[ K_n := \bigcup_{i=1}^{n} H_i, \qquad n = 1, 2, \dots. \]
Then $K_n \subset E$, and by Lemma 3.9 and (6),
\[ \sum_{i=1}^{n} \mu^*(E_i) < \sum_{i=1}^{n} \mu^*(H_i) + \epsilon = \mu^*(K_n) + \epsilon \le \mu^*(E) + \epsilon. \]
The arbitrariness of $\epsilon$ (and of $n$) proves the wanted inequality $\ge$ in (5). Now $\mu^*(E)$ is the finite sum of the series $\sum_i \mu^*(E_i)$; hence, given $\epsilon > 0$, we may choose $n \in \mathbb{N}$ such that $\mu^*(E) < \sum_{i=1}^{n} \mu^*(E_i) + \epsilon$. For that $n$, if the compact set $K_n$ is defined as before, we get $\mu^*(E) < \mu^*(K_n) + 2\epsilon$, and therefore $\mu^*(E) = \mu_*(E)$, that is, $E \in \mathcal{M}_0$.


Lemma 3.13. $\mathcal{M}_0$ is a ring of subsets of $X$, that is, it is closed under the operations $\cup$, $\cap$, $-$ between sets. Furthermore, if $E \in \mathcal{M}_0$, then for each $\epsilon > 0$, there exist $K \in \mathcal{K}$ and $V \in \tau$ such that
\[ K \subset E \subset V; \qquad \mu(V - K) < \epsilon. \tag{7} \]

Proof. Note that $V - K$ is open, so that (7) makes sense. We prove it first. By definition of $\mu^*$ and $\mathcal{M}_0$, there exist $V \in \tau$ and $K \in \mathcal{K}$ such that $K \subset E \subset V$ and
\[ \mu(V) - \epsilon/2 < \mu^*(E) < \mu^*(K) + \epsilon/2. \]
In particular, $\mu(V) < \infty$ and $\mu(V - K) \le \mu(V) < \infty$, so that $V, V - K \in \tau_0$. By Lemma 3.11, $V - K \in \mathcal{M}_0$. Since also $K \in \mathcal{K} \subset \mathcal{M}_0$, it follows from Lemma 3.12 that
\[ \mu^*(K) + \mu^*(V - K) = \mu(V) < \mu^*(K) + \epsilon. \]
Since $\mu^*(K) < \infty$, we obtain $\mu^*(V - K) < \epsilon$.

Now, let $E_i \in \mathcal{M}_0$, $i = 1, 2$. Given $\epsilon > 0$, pick $K_i, V_i$ as in (7). Since
\[ E_1 - E_2 \subset V_1 - K_2 \subset (V_1 - K_1) \cup (K_1 - V_2) \cup (V_2 - K_2), \]
and the sets on the right belong to $\mathcal{M}_0$, we have by the subadditivity of $\mu^*$ (Lemma 3.8) and (7),
\[ \mu^*(E_1 - E_2) < \mu^*(K_1 - V_2) + 2\epsilon. \]
Since $K_1 - V_2$ $(= K_1 \cap V_2^c)$ is a compact subset of $E_1 - E_2$, it follows that $\mu^*(E_1 - E_2) \le \mu_*(E_1 - E_2) + 2\epsilon$ (and of course $\mu^*(E_1 - E_2) \le \mu^*(E_1) < \infty$), so that $E_1 - E_2 \in \mathcal{M}_0$.

Now $E_1 \cup E_2 = (E_1 - E_2) \cup E_2 \in \mathcal{M}_0$ as the disjoint union of sets in $\mathcal{M}_0$ (cf. Lemma 3.12), and $E_1 \cap E_2 = E_1 - (E_1 - E_2) \in \mathcal{M}_0$ since $\mathcal{M}_0$ is closed under difference.


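Definition 3.14.
\[ \mathcal{M} := \{E \in \mathcal{P}(X);\ E \cap K \in \mathcal{M}_0 \text{ for all } K \in \mathcal{K}\}. \]
If $E$ is a closed set, then $E \cap K \in \mathcal{K} \subset \mathcal{M}_0$ for all $K \in \mathcal{K}$, so that $\mathcal{M}$ contains all closed sets.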


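Lemma 3.15. $\mathcal{M}$ is a $\sigma$-algebra containing the Borel algebra $\mathcal{B}$ (of $X$), and
\[ \mathcal{M}_0 = \{E \in \mathcal{M};\ \mu^*(E) < \infty\}. \tag{8} \]
Furthermore, the restriction $\mu := \mu^*|_{\mathcal{M}}$ is a measure.

Proof. We first prove (8). If $E \in \mathcal{M}_0$, then since $\mathcal{K} \subset \mathcal{M}_0$, we surely have $E \cap K \in \mathcal{M}_0$ for all $K \in \mathcal{K}$ (by Lemma 3.13), so that $E \in \mathcal{M}$ (and of course $\mu^*(E) < \infty$ by definition).

On the other hand, suppose $E \in \mathcal{M}$ and $\mu^*(E) < \infty$. Let $\epsilon > 0$. By definition, there exists $V \in \tau$ such that $E \subset V$ and $\mu(V) < \mu^*(E) + 1 < \infty$. By Lemma 3.11, $V \in \mathcal{M}_0$. Applying Lemma 3.13 (7) to $V$, we obtain a set $K \in \mathcal{K}$ such that $K \subset V$ and $\mu^*(V - K) < \epsilon$. Since $E \cap K \in \mathcal{M}_0$ (by definition of $\mathcal{M}$), there exists $H \in \mathcal{K}$ such that $H \subset E \cap K$ and
\[ \mu^*(E \cap K) < \mu^*(H) + \epsilon. \]
Now $E \subset (E \cap K) \cup (V - K)$, so that by Lemma 3.8,
\[ \mu^*(E) \le \mu^*(E \cap K) + \mu^*(V - K) < \mu^*(H) + 2\epsilon \le \mu_*(E) + 2\epsilon. \]
The arbitrariness of $\epsilon$ implies that $\mu^*(E) \le \mu_*(E)$, so that $E \in \mathcal{M}_0$, and (8) is proved.

Since $\mathcal{M}$ contains all closed sets (see the observation following Definition 3.14), we may conclude that $\mathcal{B} \subset \mathcal{M}$ once we know that $\mathcal{M}$ is a $\sigma$-algebra.

If $E \in \mathcal{M}$, then for all $K \in \mathcal{K}$,
\[ E^c \cap K = K - (E \cap K) \in \mathcal{M}_0 \]
by definition and Lemma 3.13. Hence $E^c \in \mathcal{M}$.
Let $E_i \in \mathcal{M}$, $i = 1, 2, \dots$, with union $E$. Then for each $K \in \mathcal{K}$,
\[ E \cap K = \bigcup_i (E_i \cap K) = \bigcup_i F_i, \]
where
\[ F_i := (E_i \cap K) - \bigcup_{j < i} (E_j \cap K) \]
are mutually disjoint sets in $\mathcal{M}_0$ (by definition of $\mathcal{M}$ and Lemma 3.13). Since $\mu^*(E \cap K) \le \mu^*(K) < \infty$ (by Lemma 3.9), it follows from Lemma 3.12 that $E \cap K \in \mathcal{M}_0$, and we conclude that $E \in \mathcal{M}$.

Finally, let $E_i \in \mathcal{M}$ be mutually disjoint with union $E$. If $\mu^*(E_i) = \infty$ for some $i$, then also $\mu^*(E) = \infty$ by monotonicity, and $\mu^*(E) = \sum_i \mu^*(E_i)$ trivially. Suppose then that $\mu^*(E_i) < \infty$ for all $i$. By (8), it follows that $E_i \in \mathcal{M}_0$ for all $i$, and the wanted $\sigma$-additivity of $\mu = \mu^*|_{\mathcal{M}}$ follows from Lemma 3.12.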


We call $(X, \mathcal{M}, \mu)$ the measure space associated with the positive linear functional $\phi$. Integration in the following discussion is performed over this measure space.


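Lemma 3.16. For all $f \in C_c^+(X)$,
\[ \phi(f) \le \int_X f \, d\mu. \]

Proof. Fix $f \in C_c^+(X)$, let $K$ be its (compact) support, and let $0 \le a < b$ be such that $[a, b]$ contains the (compact) range of $f$. Given $\epsilon > 0$, choose points
\[ 0 \le y_0 \le a < y_1 < \dots < y_n = b \]
such that $y_k - y_{k-1} < \epsilon$, and set
\[ E_1 := [y_0 \le f \le y_1] \cap K; \qquad E_k := [y_{k-1} < f \le y_k], \quad k = 2, \dots, n. \]
Since $f$ is continuous with support $K$, the sets $E_k$ are disjoint Borel sets with union $K$. By definition of our measure space, there exist open sets $V_k$ such that
\[ E_k \subset V_k; \qquad \mu(V_k) < \mu(E_k) + \epsilon/n \]
for $k = 1, \dots, n$. Since $f \le y_k$ on $E_k$, it follows from the continuity of $f$ that there exist open sets $U_k$ such that
\[ E_k \subset U_k; \qquad f < y_k + \epsilon \text{ on } U_k. \]
Taking $W_k := V_k \cap U_k$, we have for all $k = 1, \dots, n$
\[ E_k \subset W_k; \qquad \mu(W_k) < \mu(E_k) + \epsilon/n; \qquad f < y_k + \epsilon \text{ on } W_k. \]
Let $\{h_k;\ k = 1, \dots, n\}$ be a partition of unity in $C_c(X)$ subordinate to the open covering $\{W_k;\ k = 1, \dots, n\}$ of $K$. Then
\[ f = \sum_{k=1}^{n} h_k f, \]
and for all $k = 1, \dots, n$
\[ h_k f \le h_k (y_k + \epsilon) \]
(since $h_k \in \Omega(W_k)$ and $f < y_k + \epsilon$ on $W_k$),
\[ \phi(h_k) \le \mu(W_k) \]
(since $h_k \in \Omega(W_k)$), and
\[ y_k = y_{k-1} + (y_k - y_{k-1}) < f + \epsilon \text{ on } E_k. \]
Therefore,
\begin{align*}
\phi(f) &= \sum_{k=1}^{n} \phi(h_k f) \le \sum_k (y_k + \epsilon)\,\phi(h_k) \le \sum_k (y_k + \epsilon)\,\mu(W_k) \\
&\le \sum_k (y_k + \epsilon)\,[\mu(E_k) + \epsilon/n] \le \sum_k y_k\,\mu(E_k) + \epsilon\,\mu(K) + \sum_k (y_k + \epsilon)\,\epsilon/n \\
&\le \int_X \sum_k (f + \epsilon)\, I_{E_k} \, d\mu + \epsilon\,\mu(K) + (b + \epsilon)\,\epsilon = \int_X f \, d\mu + 2\epsilon\,\mu(K) + (b + \epsilon)\,\epsilon.
\end{align*}
Since $\mu(K) < \infty$, the lemma follows from the arbitrariness of $\epsilon$.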


Lemma 3.17. For all $f \in C_c(X)$,
\[ \phi(f) = \int_X f \, d\mu. \]

Proof. By linearity, it suffices to prove the lemma for real $f \in C_c(X)$. Given such $f$, let $K$ be its (compact) support, and let $M = \sup |f|$. For any $\epsilon > 0$, choose $V$ open such that $K \subset V$ and $\mu(V) < \mu(K) + \epsilon$; then choose $h \in \Omega(V)$ such that $\mu(V) < \phi(h) + \epsilon$. By Urysohn's lemma, there is a function $k \in \Omega(V)$ such that $k = 1$ on $K$. Let $g = \max\{h, k\}$ $(= (1/2)(h + k + |h - k|) \in C_c^+(X))$. Then $g \in \Omega(V)$, $g = 1$ on $K$, and $\mu(V) < \phi(g) + \epsilon$. Define $F = f + Mg$. Then $F \in C_c^+(X)$ and $F = f + M$ on $K$. By Lemma 3.16,
\[ \phi(F) \le \int_X F \, d\mu, \]
that is, since $g \in \Omega(V)$,
\[ \phi(f) + M\phi(g) \le \int_X f \, d\mu + M \int_X g \, d\mu \le \int_X f \, d\mu + M\mu(V) \le \int_X f \, d\mu + M[\phi(g) + \epsilon]. \]
Hence, by the arbitrariness of $\epsilon$,
\[ \phi(f) \le \int_X f \, d\mu \]
for all real $f \in C_c(X)$. Replacing $f$ by $-f$, we also have
\[ -\phi(f) = \phi(-f) \le \int_X (-f) \, d\mu = -\int_X f \, d\mu, \]
so that $\phi(f) = \int_X f \, d\mu$.


3.3 The Riesz–Markov representation theorem


Theorem 3.18 (Riesz–Markov). Let $(X, \tau)$ be a locally compact Hausdorff space, and let $\phi$ be a positive linear functional on $C_c(X)$. Let $(X, \mathcal{M}, \mu)$ be the measure space associated with $\phi$. Then
\[ \phi(f) = \int_X f \, d\mu, \qquad f \in C_c(X). \tag{*} \]
In addition, the following properties are valid:
(1) $\mathcal{B}(X) \subset \mathcal{M}$.
(2) $\mu$ is finite on $\mathcal{K}$ (the compact subsets of $X$).
(3) $\mu(E) = \inf_{E \subset V \in \tau} \mu(V)$ for all $E \in \mathcal{M}$.
(4) $\mu(E) = \sup_{\{K \in \mathcal{K};\, K \subset E\}} \mu(K)$ (i) for all $E \in \tau$, and (ii) for all $E \in \mathcal{M}$ with finite measure.
(5) The measure space $(X, \mathcal{M}, \mu)$ is complete.
Furthermore, the measure $\mu$ is uniquely determined on $\mathcal{M}$ by (*), (2), (3), and (4)-(i).

Proof. Properties (*), (1), (2), and (4)-(ii) are valid by Lemmas 3.17, 3.15, 3.9, and 3.15 (together with Definition 3.10 and the following notation (3)), respectively. Property (3) follows from Definition 3.7, since $\mu := \mu^*|_{\mathcal{M}}$.




If $E \in \mathcal{M}$ has measure zero, and $F \subset E$, then $\mu^*(F) = 0$ and $\mu^*(K) = 0$ for all $K \in \mathcal{K}$, $K \subset F$ (by monotonicity), so that $\mu_*(F) = 0 = \mu^*(F) < \infty$, that is, $F \in \mathcal{M}_0 \subset \mathcal{M}$, and (5) is proved.

We prove (4)-(i). Let $V \in \tau$. If $\mu(V) < \infty$, then $V \in \mathcal{M}_0$ by Lemma 3.11, and (4)-(i) follows from the definition of $\mathcal{M}_0$. Assume then that $\mu(V) = \infty$. By Definition 3.5, for each $n \in \mathbb{N}$, there exists $f_n \in \Omega(V)$ such that $\phi(f_n) > n$. Let $K_n := \operatorname{supp}(f_n)$. Then for all $n$,
\[ \mu_*(V) \ge \mu(K_n) \ge \int_X f_n \, d\mu = \phi(f_n) > n, \]
so that $\mu_*(V) = \infty = \mu(V)$, and (4)-(i) is valid for $V$.

Suppose $\nu$ is any positive measure on $\mathcal{M}$ satisfying Properties (*), (2), (3), and (4)-(i). Let $\epsilon > 0$ and $K \in \mathcal{K}$. By (2) and (3), there exists $V \in \tau$ such that $K \subset V$ and $\nu(V) < \nu(K) + \epsilon$. By Urysohn's lemma (3.1), there exists $f \in \Omega(V)$ such that $f = 1$ on $K$. Hence $I_K \le f \le I_V$, and therefore, by (*) for both $\mu$ and $\nu$,
\[ \mu(K) = \int_X I_K \, d\mu \le \int_X f \, d\mu = \phi(f) = \int_X f \, d\nu \le \int_X I_V \, d\nu = \nu(V) < \nu(K) + \epsilon. \]
Hence $\mu(K) \le \nu(K)$, and so $\mu(K) = \nu(K)$ by symmetry. By (4)-(i), it follows that $\mu = \nu$ on $\tau$, hence on $\mathcal{M}$, by (3).


In case $X$ is $\sigma$-compact, the following additional structural properties are valid for the measure space associated with $\phi$.

Theorem 3.19. Let $X$ be a Hausdorff, locally compact, $\sigma$-compact space, and let $(X, \mathcal{M}, \mu)$ be the measure space associated with the positive linear functional $\phi$ on $C_c(X)$. Then:

(1) For all $E \in \mathcal{M}$ and $\epsilon > 0$, there exist $F$ closed and $V$ open such that
\[ F \subset E \subset V; \qquad \mu(V - F) < \epsilon. \]

(2) Properties (3) and (4) in Theorem 3.18 are valid for all $E \in \mathcal{M}$ (this fact is formulated by the expression: $\mu$ is regular. One says also that $\mu|_{\mathcal{B}(X)}$ is a regular Borel measure).

(3) For all $E \in \mathcal{M}$, there exist an $F_\sigma$ set $A$ and a $G_\delta$ set $B$ such that
\[ A \subset E \subset B; \qquad \mu(B - A) = 0 \]
(i.e., every set in $\mathcal{M}$ is the union of an $F_\sigma$ set and a null set).


Proof. The $\sigma$-compactness hypothesis means that $X = \bigcup_i K_i$ with $K_i$ compact. Let $\epsilon > 0$ and $E \in \mathcal{M}$. By 3.18(2), $\mu(K_i \cap E) \le \mu(K_i) < \infty$, and therefore, by 3.18(3), there exist open sets $V_i$ such that
\[ K_i \cap E \subset V_i; \qquad \mu(V_i - (K_i \cap E)) < \epsilon/2^{i+1}, \qquad i = 1, 2, \dots. \]
Set $V = \bigcup_i V_i$. Then $V$ is open, contains $E$, and
\[ \mu(V - E) \le \mu\Bigl(\bigcup_i \bigl(V_i - (K_i \cap E)\bigr)\Bigr) < \epsilon/2. \]
Replacing $E$ by $E^c$, we obtain in the same fashion an open set $W$ containing $E^c$ such that $\mu(W - E^c) < \epsilon/2$. Setting $F := W^c$, we obtain a closed set contained in $E$ such that $\mu(E - F) < \epsilon/2$, and (1) follows.

Next, for an arbitrary closed set $F$, we have $F = \bigcup_i (K_i \cap F)$. Let $H_n = \bigcup_{i=1}^{n} K_i \cap F$. Then $H_n$ is compact for each $n$, $H_n \subset F$, and $\mu(H_n) \to \mu(F)$. Therefore, Property (4) in Theorem 3.18 is valid for closed sets. If $E \in \mathcal{M}$, the first part of the proof gives us a closed subset $F$ of $E$ such that $\mu(E - F) < 1$. If $\mu(E) = \infty$, also $\mu(F) = \infty$, and therefore
\[ \sup_{\{K \in \mathcal{K};\, K \subset E\}} \mu(K) \ge \sup_{\{K \in \mathcal{K};\, K \subset F\}} \mu(K) = \mu(F) = \infty = \mu(E). \]
Together with (3) and (4)-(ii) in Theorem 3.18, this means that Properties (3) and (4) in 3.18 are valid for all $E \in \mathcal{M}$.

Finally, for any $E \in \mathcal{M}$, take $\epsilon = 1/n$ ($n = 1, 2, \dots$) in (1); this gives us closed sets $F_n$ and open sets $V_n$ such that
\[ F_n \subset E \subset V_n; \qquad \mu(V_n - F_n) < 1/n, \qquad n = 1, 2, \dots. \]
Set $A = \bigcup_n F_n$ and $B = \bigcap_n V_n$. Then $A \in F_\sigma$, $B \in G_\delta$, $A \subset E \subset B$, and since $B - A \subset V_n - F_n$, we have $\mu(B - A) < 1/n$ for all $n$, so that $\mu(B - A) = 0$.


3.4 Lusin’s theorem 


For the measure space of Theorem 3.18, the relation between M-measurable 
functions and continuous functions is described in the following. 


Theorem 3.20 (Lusin). Let $X$ be a locally compact Hausdorff space, and let $(X, \mathcal{M}, \mu)$ be a measure space such that $\mathcal{B}(X) \subset \mathcal{M}$ and Properties (2), (3), and (4)-(ii) of Theorem 3.18 are satisfied. Let $A \in \mathcal{M}$, $\mu(A) < \infty$, and let $f : X \to \mathbb{C}$ be measurable and vanish on $A^c$. Then, for any $\epsilon > 0$, there exists $g \in C_c(X)$ such that $\mu([f \ne g]) < \epsilon$. In case $f$ is bounded, one may choose $g$ such that $\|g\|_u \le \|f\|_u$.


Proof. Suppose the theorem proved for bounded functions $f$ (satisfying the hypothesis of the theorem). For an arbitrary $f$ (as in the theorem), the sets $E_n := [|f| > n]$, $n = 1, 2, \dots$, form a decreasing sequence of measurable subsets of $A$. Since $\mu(A) < \infty$, it follows from Lemma 1.11 that $\lim \mu(E_n) = \mu(\bigcap_n E_n) = \mu(\emptyset) = 0$. Therefore, we may choose $n$ such that $\mu(E_n) < \epsilon/2$. The function $f_n := f I_{E_n^c}$ satisfies the hypothesis of the theorem and is also bounded (by $n$). By our assumption, there exists $g \in C_c(X)$ such that $\mu([g \ne f_n]) < \epsilon/2$. Therefore,
\begin{align*}
\mu([g \ne f]) &= \mu([g \ne f] \cap E_n) + \mu([g \ne f_n] \cap E_n^c) \\
&\le \mu(E_n) + \mu([g \ne f_n]) < \epsilon.
\end{align*}




Next, we may restrict our attention to non-negative functions $f$ as above. Indeed, in the general case, we may write $f = \sum_{k=0}^{3} i^k u_k$, with $u_k$ non-negative, measurable, bounded, and vanishing on $A^c$. By the special case we assumed, there exist $g_k \in C_c(X)$ such that $\mu([g_k \ne u_k]) < \epsilon/4$. Let $E = \bigcup_{k=0}^{3} [g_k \ne u_k]$ and $g := \sum_{k=0}^{3} i^k g_k$. Then $g \in C_c(X)$, and since $[g \ne f] \subset E$, we have indeed $\mu([g \ne f]) < \epsilon$.

Let then $0 \le f < M$ satisfy the hypothesis of the theorem. Replacing $f$ by $f/M$, we may assume that $0 \le f < 1$.

Since $\mu(A) < \infty$, Property (4)-(ii) gives us a compact set $K \subset A$ such that $\mu(A - K) < \epsilon/2$. Suppose the theorem is true for $A$ compact. The function $f_K := f I_K$ is measurable with range in $[0, 1)$ and vanishes outside $K$. By the theorem for compact $A$, there exists $g \in C_c(X)$ such that $\mu([g \ne f_K]) < \epsilon/2$. Then
\begin{align*}
\mu([g \ne f]) &= \mu([g \ne f_K] \cap (K \cup A^c)) + \mu([g \ne f] \cap K^c \cap A) \\
&\le \mu([g \ne f_K]) + \mu(A - K) < \epsilon.
\end{align*}
It remains to prove the theorem for $f$ measurable with range in $[0, 1)$, that vanishes on the complement of a compact set $A$.
By Theorem 1.8, there exist measurable simple functions
\[ 0 \le \varphi_1 \le \varphi_2 \le \dots \le f \]
such that $f = \lim \varphi_n$. Therefore, $f = \sum_n \psi_n$, where $\psi_1 = \varphi_1$, $\psi_n = \varphi_n - \varphi_{n-1} = 2^{-n} I_{E_n}$ (for $n > 1$), and $E_n$ are measurable subsets of $A$ (so that $\mu(E_n) < \infty$). Since $A$ is a compact subset of the locally compact Hausdorff space $X$, there exists an open set $V$ with compact closure such that $A \subset V$ (cf. (1) in the proof of Theorem 3.1). By Properties (3) and (4)-(ii) of the measure space (since $\mu(E_n) < \infty$), there exist $K_n$ compact and $V_n$ open such that
\[ K_n \subset E_n \subset V_n \subset V, \]
and
\[ \mu(V_n - K_n) < \epsilon/2^n, \qquad n = 1, 2, \dots. \]
By Urysohn's lemma (3.1), there exist $h_n \in C_c(X)$ such that $0 \le h_n \le 1$, $h_n = 1$ on $K_n$, and $h_n = 0$ on $V_n^c$. Set
\[ g = \sum_n 2^{-n} h_n. \]
The series is majorized by the convergent series of constants $\sum 2^{-n}$, hence converges uniformly on $X$; therefore $g$ is continuous. For all $n$, $V_n \subset \operatorname{cl}(V)$ and $g$ vanishes on the set $\bigcap_n V_n^c = (\bigcup_n V_n)^c$, which contains $(\operatorname{cl}(V))^c$; thus the support of $g$ is contained in the compact set $\operatorname{cl}(V)$, and so $g \in C_c(X)$. Since $2^{-n} h_n = \psi_n$ on $K_n \cup V_n^c$, we have
\[ [g \ne f] \subset \bigcup_n [2^{-n} h_n \ne \psi_n] \subset \bigcup_n (V_n - K_n), \]
and therefore
\[ \mu([g \ne f]) < \sum_n \epsilon/2^n = \epsilon. \]


We show finally how to "correct" $g$ so that $\|g\|_u \le \|f\|_u$, when $f$ is a bounded function satisfying the hypothesis of the theorem. Suppose $g \in C_c(X)$ is such that $\mu([g \ne f]) < \epsilon$. Let $E = [|g| \le \|f\|_u]$. Define
\[ g_1 = g I_E + (g/|g|)\,\|f\|_u\, I_{E^c}. \]
Then $g_1$ is continuous (!), $\|g_1\|_u \le \|f\|_u$, and since $g_1(x) = 0$ iff $g(x) = 0$, $g_1$ has compact support. Since $[g = f] \subset E$, we have $[g = f] \subset [g_1 = f]$, hence
\[ \mu([g_1 \ne f]) \le \mu([g \ne f]) < \epsilon. \]


Corollary 3.21. Let $(X, \mathcal{M}, \mu)$ be a measure space as in Theorem 3.20. Then for each $p \in [1, \infty)$, $C_c(X)$ is dense in $L^p(\mu)$.

In the terminology of Definition 1.28, Corollary 3.21 establishes that $L^p(\mu)$ is the completion of $C_c(X)$ in the $\|\cdot\|_p$-metric.

Proof. Since $\mathcal{B}(X) \subset \mathcal{M}$, Borel functions are $\mathcal{M}$-measurable; in particular, continuous functions are $\mathcal{M}$-measurable. If $f \in C_c(X)$ and $K := \operatorname{supp} f$, then $\int_X |f|^p \, d\mu \le \|f\|_u^p\, \mu(K) < \infty$ by Property (2). Thus $C_c(X) \subset L^p(\mu)$ for all $p \in [1, \infty)$. By Theorem 1.27, it suffices to prove that for each simple measurable function $\varphi$ vanishing outside a measurable set $A$ of finite measure and for each $\epsilon > 0$, there exists $g \in C_c(X)$ such that $\|\varphi - g\|_p < \epsilon$. By Theorem 3.20 applied to $\varphi$, there exists $g \in C_c(X)$ such that $\mu([\varphi \ne g]) < (\epsilon/(2\|\varphi\|_u))^p$ and $\|g\|_u \le \|\varphi\|_u$. Then
\[ \|\varphi - g\|_p^p = \int_{[\varphi \ne g]} |\varphi - g|^p \, d\mu \le 2^p \|\varphi\|_u^p\, \mu([\varphi \ne g]) < \epsilon^p, \]
as wanted.


By Lemma 1.30, we obtain 


Corollary 3.22. Let $(X, \mathcal{M}, \mu)$ be a measure space as in Theorem 3.20. Let $f \in L^p(\mu)$ for some $p \in [1, \infty)$. Then there exists a sequence $\{g_n\} \subset C_c(X)$ that converges to $f$ almost everywhere.


In view of the observation following the statement of Corollary 3.21, it is interesting to find the completion of $C_c(X)$ with respect to the $\|\cdot\|_u$-metric. We start with a definition.

Definition 3.23. Let $X$ be a locally compact Hausdorff space. Then $C_0(X)$ will denote the space of all complex continuous functions $f$ on $X$ with the following property:

(*) for each $\epsilon > 0$, there exists a compact subset $K \subset X$ such that $|f| < \epsilon$ on $K^c$.

A function with Property (*) is said to vanish at infinity.




Under pointwise operations, $C_0(X)$ is a complex vector space that contains $C_c(X)$. If $f \in C_0(X)$ and $K$ is as in (*) with $\epsilon = 1$, then $\|f\|_u \le \sup_K |f| + 1 < \infty$, and it follows that $C_0(X)$ is a normed space for the uniform norm.

Theorem 3.24. $C_0(X)$ is the completion of $C_c(X)$.

Proof. Let $\{f_n\} \subset C_0(X)$ be Cauchy. Then $f := \lim f_n$ exists uniformly on $X$, so that $f$ is continuous on $X$ and $\|f_n - f\|_u \to 0$. Given $\epsilon > 0$, let $n_0 \in \mathbb{N}$ be such that $\|f_n - f\|_u < \epsilon/2$ for all $n > n_0$. Fix $n > n_0$ and a compact set $K$ such that $|f_n| < \epsilon/2$ on $K^c$ (cf. (*)). Then $|f| \le |f - f_n| + |f_n| < \epsilon$ on $K^c$, so that $f \in C_0(X)$, and we conclude that $C_0(X)$ is complete.

Given $f \in C_0(X)$ and $\epsilon > 0$, let $K$ be as in (*). By Urysohn's lemma (3.1), there exists $h \in C_c(X)$ such that $0 \le h \le 1$ on $X$ and $h = 1$ on $K$. Then $hf \in C_c(X)$, $|f - hf| = (1 - h)|f| = 0$ on $K$, and $|f - hf| \le |f| < \epsilon$ on $K^c$, so that $\|f - hf\|_u < \epsilon$. This shows that $C_c(X)$ is dense in $C_0(X)$.
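To illustrate Definition 3.23 and the density argument in the proof of Theorem 3.24, here is a sketch for one specific function. The choice $X = \mathbb{R}$ and $f(x) = 1/(1 + x^2)$ is an assumption made for illustration and is not an example taken from the text.

```latex
% Illustrative example: f(x) = 1/(1+x^2) on X = \mathbb{R}.
% f is continuous and never zero, so supp f = \mathbb{R} and f \notin C_c(\mathbb{R}).
\begin{align*}
&\text{Given } \epsilon > 0, \text{ take } K := [-R, R] \text{ with } R \ge \epsilon^{-1/2}.\\
&\text{For } |x| > R:\quad |f(x)| = \frac{1}{1 + x^2} < \frac{1}{R^2} \le \epsilon,
  \quad\text{so } f \in C_0(\mathbb{R}).\\
&\text{With } h \in C_c(\mathbb{R}),\ 0 \le h \le 1,\ h = 1 \text{ on } K:\quad
  \|f - hf\|_u = \sup_{|x| > R} (1 - h(x))\, f(x) \le \frac{1}{1 + R^2} < \epsilon .
\end{align*}
```

So this $f$ vanishes at infinity without having compact support, and the truncations $hf$ approximate it uniformly.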


Example 3.25. Consider the special case $X = \mathbb{R}^k$, the $k$-dimensional Euclidean space. If $f \in C_c(\mathbb{R}^k)$ and $T$ is any closed cell containing $\operatorname{supp} f$, let $\phi(f)$ be the Riemann integral of $f$ on $T$. Then $\phi$ is a well-defined positive linear functional on $C_c(\mathbb{R}^k)$. Let $(\mathbb{R}^k, \mathcal{M}, m)$ be the associated measure space as in Theorem 3.18. Then, by Theorem 3.18, the integral $\int_{\mathbb{R}^k} f \, dm$ coincides with the Riemann integral of $f$ for all $f \in C_c(\mathbb{R}^k)$.

For $n \in \mathbb{N}$ large enough and $a < b$ real, let $f_{n,a,b} : \mathbb{R} \to [0, 1]$ denote the function equal to zero outside $[a + 1/n, b - 1/n]$, to 1 in $[a + 2/n, b - 2/n]$, and linear elsewhere. Then
\[ \int_a^b f_{n,a,b} \, dx = b - a - 3/n. \]
If $T = \{x \in \mathbb{R}^k;\ a_i \le x_i \le b_i,\ i = 1, \dots, k\}$, consider the function $F_{n,T}(x) = \prod_{i=1}^{k} f_{n,a_i,b_i}(x_i)$, $F_{n,T} \in C_c(\mathbb{R}^k)$. Then
\[ F_{n,T} \le I_T \le F_{n,T_n}, \]
where $T_n = \{x \in \mathbb{R}^k;\ a_i - 2/n \le x_i \le b_i + 2/n\}$. Therefore, by Fubini's theorem for the Riemann integral on cells,
\[ \prod_{i=1}^{k} (b_i - a_i - 3/n) = \int F_{n,T} \, dx_1 \dots dx_k \le m(T) \le \int F_{n,T_n} \, dx_1 \dots dx_k = \prod_{i=1}^{k} (b_i - a_i + 1/n). \]
Letting $n \to \infty$, we conclude that
\[ m(T) = \prod_{i=1}^{k} (b_i - a_i) = \operatorname{vol}(T). \]
By Theorem 2.13 and the subsequent constructions of Lebesgue's measure on $\mathbb{R}$ and of the product measure, the measure $m$ coincides with Lebesgue's measure on the Borel subsets of $\mathbb{R}^k$.
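The value $b - a - 3/n$ in Example 3.25 can be checked by splitting the trapezoidal graph of $f_{n,a,b}$ into its plateau and its two linear ramps. The sketch below assumes $n > 4/(b - a)$, so that the plateau $[a + 2/n,\ b - 2/n]$ is non-degenerate; this explicit threshold is our reading of "$n$ large enough".

```latex
% Area under f_{n,a,b}: one plateau of height 1 plus two ramps of base 1/n.
\begin{align*}
\int_a^b f_{n,a,b}\,dx
  &= \underbrace{\Bigl(b - \tfrac{2}{n}\Bigr) - \Bigl(a + \tfrac{2}{n}\Bigr)}_{\text{plateau}}
   + \underbrace{2 \cdot \tfrac12 \cdot \tfrac1n \cdot 1}_{\text{two ramps}}
   = b - a - \tfrac{4}{n} + \tfrac{1}{n} = b - a - \tfrac{3}{n},\\
\int f_{n,\,a-2/n,\,b+2/n}\,dx
  &= \Bigl(b + \tfrac{2}{n}\Bigr) - \Bigl(a - \tfrac{2}{n}\Bigr) - \tfrac{3}{n}
   = b - a + \tfrac{1}{n}.
\end{align*}
```

The second line is the one-dimensional factor in the upper bound for $m(T)$.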




3.5 The support of a measure 


Definition 3.26. Let $(X, \mathcal{M}, \mu)$ be as in Theorem 3.18. Let $V$ be the union of all the open $\mu$-null sets in $X$. The support of $\mu$ is the complement $V^c$ of $V$, and is denoted by $\operatorname{supp} \mu$.

Since $V$ is open, $\operatorname{supp} \mu$ is a closed subset of $X$. Also, by Property (4)-(i) of $\mu$ (cf. Theorem 3.18),
\[ \mu(V) = \sup_{K \in \mathcal{K};\, K \subset V} \mu(K). \tag{1} \]
If $K$ is a compact subset of $V$, the open $\mu$-null sets are an open cover of $K$, and there exist therefore finitely many $\mu$-null sets that cover $K$; hence $\mu(K) = 0$, and it follows from (1) that $\mu(V) = 0$. Thus $S = \operatorname{supp} \mu$ is the smallest closed set with a $\mu$-null complement.

For any $f \in L^1(\mu)$, we have
\[ \int_X f \, d\mu = \int_S f \, d\mu. \tag{2} \]
If $f \in C_c(X)$ is non-negative and $\int_X f \, d\mu = 0$, then $f = 0$ identically on the support $S$ of $\mu$. Indeed, suppose there exists $x_0 \in S$ such that $f(x_0) \ne 0$. Then there exists an open neighborhood $U$ of $x_0$ such that $f \ne 0$ on $U$. Let $K$ be any compact subset of $U$. Then $c := \min_K f > 0$, and
\[ 0 = \int_X f \, d\mu \ge \int_K f \, d\mu \ge c\,\mu(K). \]
Hence $\mu(K) = 0$, and therefore $\mu(U) = 0$ by Property (4)-(i) of $\mu$ (cf. Theorem 3.18). Thus, $U \subset S^c$, which implies the contradiction $x_0 \in S^c$.

Together with (2), this shows that $\int_X f \, d\mu = 0$ for a non-negative function $f \in C_c(X)$ iff $f$ vanishes identically on $\operatorname{supp} \mu$.
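As an illustration of Definition 3.26, consider a point mass. The setting below ($X = \mathbb{R}$, $\mu = \delta_0$, which is the measure associated with the functional $\phi(f) = f(0)$) is an assumption chosen for illustration, not an example from the text.

```latex
% Illustrative example: \mu = \delta_0 on X = \mathbb{R},
% i.e. \mu(E) = 1 if 0 \in E, and \mu(E) = 0 otherwise.
\begin{align*}
&\text{An open set } U \text{ is } \mu\text{-null iff } 0 \notin U,
  \text{ so } V = \mathbb{R} \setminus \{0\} \text{ and } \operatorname{supp}\mu = V^c = \{0\}.\\
&\text{For non-negative } f \in C_c(\mathbb{R}):\quad
  \int_X f\,d\mu = f(0) = 0 \iff f = 0 \text{ on } \operatorname{supp}\mu .
\end{align*}
```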


3.6 Measures on $\mathbb{R}^k$; differentiability


Notation 3.27. If $E \subset \mathbb{R}^k$, we denote the diameter of $E$ (i.e., $\sup_{x, y \in E} d(x, y)$) by $\delta(E)$. Let $\mu$ be a real or a positive Borel measure on $\mathbb{R}^k$, and let $m$ denote the Lebesgue measure on $\mathbb{R}^k$. Fix $x \in \mathbb{R}^k$, and consider the quotients $\mu(E)/m(E)$ for all open cubes $E$ containing $x$. The upper derivative of $\mu$ at $x$ is defined by
\[ (\overline{D}\mu)(x) := \limsup_{\delta(E) \to 0} \frac{\mu(E)}{m(E)} := \lim_{r \to 0} \sup_{\delta(E) < r} \frac{\mu(E)}{m(E)}. \]
The lower derivative of $\mu$ at $x$, denoted $(\underline{D}\mu)(x)$, is defined similarly by replacing $\limsup$ and $\sup$ by $\liminf$ and $\inf$, respectively.

Since $\sup_{\delta(E) < r} \mu(E)/m(E)$ is an increasing function of $r$, $(\overline{D}\mu)(x)$ is well defined. The same is true of $(\underline{D}\mu)(x)$, and we have trivially $(\underline{D}\mu)(x) \le (\overline{D}\mu)(x)$. In case these quantities are equal and finite, one says that $\mu$ is differentiable at $x$; the common value is denoted $(D\mu)(x)$, and is called the derivative of $\mu$ at $x$.

If $f(x) := \sup_{E \ni x;\, \delta(E) < r} \mu(E)/m(E) > c$ for some real $c$ and some $r > 0$, there exists an open cube $E_0$ containing $x$ with $\delta(E_0) < r$ such that $\mu(E_0)/m(E_0) > c$; the same cube $E_0$ contains every $y \in E_0$, and therefore, for each $y \in E_0$, the shown supremum over all open cubes $E$ containing $y$ with $\delta(E) < r$ is $> c$. This shows that $[f > c]$ is open, and therefore $f$ is a Borel function of $x$. Consequently, $\overline{D}\mu$ is a Borel function.

If $\mu_k$ ($k = 1, 2$) are real Borel measures with finite upper derivatives at $x$, then
\[ \overline{D}(\mu_1 + \mu_2) \le \overline{D}\mu_1 + \overline{D}\mu_2 \]
at the point $x$; for $\underline{D}$, the inequality is reversed. It follows in particular that if both $\mu_k$ are differentiable at $x$, the same is true of $\mu := \mu_1 + \mu_2$, and $(D\mu)(x) = (D\mu_1)(x) + (D\mu_2)(x)$.

The concepts of differentiability and derivative are extended to complex measures in the usual way.

The next theorem relates $D\mu$ to the Radon–Nikodym derivative $d\mu_a/dm$ of the absolutely continuous part $\mu_a$ of $\mu$ in its Lebesgue decomposition with respect to $m$ (cf. Theorem 1.45).


Theorem 3.28. Let $\mu$ be a complex Borel measure on $\mathbb{R}^k$. Then $\mu$ is differentiable $m$-a.e., and $D\mu = d\mu_a/dm$ (as elements of $L^1(\mathbb{R}^k)$).

It follows in particular that $\mu \perp m$ iff $D\mu = 0$ $m$-a.e., and $\mu \ll m$ iff $\mu(E) = \int_E (D\mu)\, dm$ for all $E \in \mathcal{B} := \mathcal{B}(\mathbb{R}^k)$.
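A one-dimensional sketch may help fix the definitions of Notation 3.27 and the singular case of Theorem 3.28. The example ($k = 1$, $\mu = \delta_0$, the unit point mass at $0$) is an assumption chosen for illustration; open cubes in $\mathbb{R}$ are open intervals, for which $m(E) = \delta(E)$.

```latex
% Illustrative example: k = 1, \mu = \delta_0 (so \mu \perp m).
\begin{align*}
x \ne 0:&\quad \delta(E) < |x| \text{ and } x \in E \ \Longrightarrow\ 0 \notin E,
  \text{ so } \frac{\mu(E)}{m(E)} = 0 \text{ and } (D\mu)(x) = 0;\\
x = 0:&\quad \frac{\mu(E)}{m(E)} = \frac{1}{\delta(E)} > \frac{1}{r}
  \text{ whenever } 0 \in E,\ \delta(E) < r,
  \text{ so } (\underline{D}\mu)(0) = (\overline{D}\mu)(0) = \infty .
\end{align*}
```

Hence $\mu$ is not differentiable at $0$, but $D\mu = 0$ off the $m$-null set $\{0\}$, in agreement with the criterion that $\mu \perp m$ iff $D\mu = 0$ $m$-a.e.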


Proof. 1. Consider first a positive Borel measure $\mu$ which is finite on compact sets.
Fix $A \in \mathcal{B}$ and $c > 0$, and assume that the Borel set
\[ A_c := A \cap [\overline{D}\mu > c] \tag{1} \]
(cf. Notation 3.27) has positive Lebesgue measure.

Since $m$ is regular, there exists a compact set $K \subset A_c$ such that $m(K) > 0$. Fix $r > 0$. For each $x \in K$, there exists an open cube $E$ with $\delta(E) < r$ such that $x \in E$ and $\mu(E)/m(E) > c$. By compactness of $K$, we may choose finitely many of these cubes, say $E_1, \dots, E_n$, such that $K \subset \bigcup_i E_i$ and $\delta(E_i) \ge \delta(E_{i+1})$. We pick a disjoint subfamily of the $E_i$ as follows: $i_1 = 1$; $i_2$ is the first index $> i_1$ such that $E_{i_2}$ does not meet $E_{i_1}$; $i_3$ is the first index $> i_2$ such that $E_{i_3}$ does not meet $E_{i_1}$ and $E_{i_2}$, etc. Let $V_j$ be the closed ball centered at the center $p_j$ of $E_{i_j}$ with diameter $3\delta(E_{i_j})$. If $\gamma_k$ denotes the ratio of the volumes of a ball and a cube in $\mathbb{R}^k$ with the same diameter, then $m(V_j) = \gamma_k 3^k m(E_{i_j})$.

For each $i = 1, \dots, n$, there exists $i_j \le i$ such that $E_i$ meets $E_{i_j}$, say, at some point $q$. Then for all $y \in E_i$,
\[ d(y, p_j) \le d(y, q) + d(q, p_j) \le \delta(E_i) + \delta(E_{i_j})/2 \le 3\delta(E_{i_j})/2, \]
since $i_j \le i$ implies that $\delta(E_i) \le \delta(E_{i_j})$. Hence $E_i \subset V_j$, and
\[ K \subset \bigcup_i E_i \subset \bigcup_j V_j. \]
Therefore,
\[ m(K) \le \sum_j m(V_j) = \gamma_k 3^k \sum_j m(E_{i_j}) < \gamma_k 3^k c^{-1} \sum_j \mu(E_{i_j}) = \gamma_k 3^k c^{-1} \mu\Bigl(\bigcup_j E_{i_j}\Bigr). \]
Each $E_{i_j}$ is an open cube of diameter $< r$ containing some point of $K$; therefore,
\[ \bigcup_j E_{i_j} \subset \{y;\ d(y, K) < r\} := K_r. \]
The (open) set $K_r$ has compact closure, and therefore $\mu(K_r) < \infty$ by hypothesis, and by the preceding calculation
\[ m(K) \le \gamma_k 3^k c^{-1} \mu(K_r). \tag{2} \]
Take $r = 1/N$ ($N \in \mathbb{N}$); $\{K_{1/N}\}_{N \in \mathbb{N}}$ is a decreasing sequence of open sets of finite $\mu$-measure with intersection $K$; therefore $\mu(K) = \lim_N \mu(K_{1/N})$, and it follows from (2) that $m(K) \le \gamma_k 3^k c^{-1} \mu(K)$. Hence
\[ \mu(A_c) \ge \mu(K) \ge \gamma_k^{-1} 3^{-k} c\, m(K) > 0. \]
We proved therefore that $m(A_c) > 0$ implies $\mu(A_c) > 0$. Consequently, if $\mu(A) = 0$ (so that $\mu(A_c) = 0$ for all $c > 0$), then $m(A_c) = 0$ for all $c > 0$. Since $A \cap [\overline{D}\mu > 0] = \bigcup_{p \ge 1} A_{1/p}$, it then follows that $m(A \cap [\overline{D}\mu > 0]) = 0$. But $\overline{D}\mu \ge 0$ since $\mu$ is a positive measure. Therefore $\overline{D}\mu = 0$ $m$-a.e. on $A$ (for each $A \in \mathcal{B}$ with $\mu(A) = 0$). Hence $0 \le \underline{D}\mu \le \overline{D}\mu = 0$ $m$-a.e. on $A$, and we conclude that $D\mu$ exists and equals zero $m$-a.e. on $A$ (if $\mu(A) = 0$).

If $\mu \perp m$, there exists $A \in \mathcal{B}$ such that $\mu(A) = 0$ and $m(A^c) = 0$. Then $m([\overline{D}\mu > 0] \cap A) = 0$ and trivially $m([\overline{D}\mu > 0] \cap A^c) = 0$. Hence $m([\overline{D}\mu > 0]) = 0$, and therefore $D\mu = 0$ $m$-a.e.

If $\mu$ is a complex Borel measure, we use its canonical (Jordan) decomposition $\mu = \sum_{k=0}^{3} i^k \mu_k$, where $\mu_k$ are finite positive Borel measures. If $\mu \perp m$, also $\mu_k \perp m$ for all $k$, hence $D\mu_k = 0$ $m$-a.e. for $k = 0, \dots, 3$, and consequently $D\mu = \sum_{k=0}^{3} i^k D\mu_k = 0$ $m$-a.e.

2. Let $\mu$ be a real Borel measure absolutely continuous with respect to $m$ (restricted to $\mathcal{B}$), and let $h = d\mu/dm$ be the Radon–Nikodym derivative ($h$ is real $m$-a.e., and since it is only determined $m$-a.e., we may assume that $h$ is a real (Borel) function in $L^1(\mathbb{R}^k)$). We claim that
\[ m([h < \overline{D}\mu]) = 0. \tag{3} \]
Assuming the claim and replacing $\mu$ by $-\mu$ (so that $h$ is replaced by $-h$), since $\overline{D}(-\mu) = -\underline{D}\mu$, we obtain $m([h > \underline{D}\mu]) = 0$. Consequently
\[ h \le \underline{D}\mu \le \overline{D}\mu \le h \quad m\text{-a.e.}, \]
that is, $\mu$ is differentiable and $D\mu = h$ $m$-a.e. The case of a complex Borel measure $\mu \ll m$ follows trivially from the real case. Finally, if $\mu$ is an arbitrary complex Borel measure, we use the Lebesgue decomposition $\mu = \mu_a + \mu_s$ as in Theorem 1.45. It follows that $\mu$ is differentiable and $D\mu = D\mu_a + D\mu_s = d\mu_a/dm$ $m$-a.e. (cf. Part 1 of the proof), as wanted.

To prove (3) it suffices to show that $E_r := [h < r < \overline{D}\mu]$ $(= [h < r] \cap [\overline{D}\mu > r])$ is $m$-null for any rational number $r$, because $[h < \overline{D}\mu] = \bigcup_{r \in \mathbb{Q}} E_r$. Fix $r \in \mathbb{Q}$, and consider the positive Borel measure
\[ \lambda(E) := \int_{E \cap [h \ge r]} (h - r)\, dm \qquad (E \in \mathcal{B}). \tag{4} \]
Since $h \in L^1(m)$, $\lambda$ is finite on compact sets, and $\lambda([h < r]) = 0$. By Part 1 of the proof, it follows that $D\lambda = 0$ $m$-a.e. on $[h < r]$. For any $E \in \mathcal{B}$,
\begin{align*}
\mu(E) &= \int_E h \, dm = \int_E [(h - r) + r]\, dm = \int_E (h - r)\, dm + r\, m(E) \\
&= \int_{E \cap [h \ge r]} (h - r)\, dm + \int_{E \cap [h < r]} (h - r)\, dm + r\, m(E) \le \lambda(E) + r\, m(E).
\end{align*}
Given $x \in \mathbb{R}^k$, we have then for any open cube $E$ containing $x$
\[ \frac{\mu(E)}{m(E)} \le \frac{\lambda(E)}{m(E)} + r. \]
Taking the supremum over all such $E$ with $\delta(E) < s$ and letting then $s \to 0$, we obtain
\[ (\overline{D}\mu)(x) \le (\overline{D}\lambda)(x) + r = r \]
$m$-a.e. on $[h < r]$. Equivalently, $m([h < r] \cap [\overline{D}\mu > r]) = 0$.


Corollary 3.29. If $f \in L^1(\mathbb{R}^k)$, then
\[ \lim_{\delta(E) \to 0} m(E)^{-1} \int_E |f(y) - f(x)|\, dy = 0 \tag{5} \]
for almost all $x \in \mathbb{R}^k$. (The limit is over open cubes containing $x$.)

In particular, the averages of $f$ over open cubes $E$ containing $x$ converge almost everywhere to $f(x)$ as $\delta(E) \to 0$.

Proof. For each $c \in \mathbb{Q} + i\mathbb{Q}$ and $N \in \mathbb{N}$, consider the finite positive Borel measure
\[ \mu_N(E) := \int_{E \cap B(0,N)} |f - c|\, dy \qquad (E \in \mathcal{B}), \]
where $B(0, N) = \{y \in \mathbb{R}^k;\ |y| < N\}$. By Theorem 3.28, $\mu_N(E)/m(E) \to |f(x) - c|\, I_{B(0,N)}(x)$ $m$-a.e. when the open cubes $E$ containing $x$ satisfy $\delta(E) \to 0$. Denote the "exceptional $m$-null set" by $G_{c,N}$, and let
\[ G := \bigcup \{G_{c,N};\ c \in \mathbb{Q} + i\mathbb{Q},\ N \in \mathbb{N}\}. \]
We have $m(G) = 0$, and the proof will be completed by showing that (5) is valid for each $x \notin G$.

Let $x \notin G$ and $\epsilon > 0$. By the density of $\mathbb{Q} + i\mathbb{Q}$ in $\mathbb{C}$, there exists $c \in \mathbb{Q} + i\mathbb{Q}$ such that $|f(x) - c| < \epsilon$. Choose $N > |x| + 1$. All open cubes containing $x$ with diameter $< 1$ are contained in $B(0, N)$, and therefore $\mu_N(E)/m(E) \to |f(x) - c|$ when $\delta(E) \to 0$. Since
\[ m(E)^{-1} \int_E |f(y) - f(x)|\, dy \le m(E)^{-1} \int_{E\, (= E \cap B(0,N))} |f(y) - c|\, dy + |c - f(x)| \le \frac{\mu_N(E)}{m(E)} + \epsilon, \]
it follows that
\[ \limsup_{\delta(E) \to 0} m(E)^{-1} \int_E |f(y) - f(x)|\, dy \le |f(x) - c| + \epsilon < 2\epsilon. \]
The arbitrariness of $\epsilon$ shows that the shown limsup is 0 for all $x \notin G$, and therefore the limit of the averages exists and equals zero for all $x \notin G$.
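The exceptional null set in Corollary 3.29 cannot be dropped in general. The following sketch checks this for one specific function; the choice $k = 1$, $f = I_{[0,1]}$ is an assumption made for illustration.

```latex
% Illustrative example: k = 1, f = I_{[0,1]}, open intervals E = (x-s, x+t), s,t > 0.
\begin{align*}
0 < x < 1,\ E \subset (0,1):&\quad
  \frac{1}{m(E)}\int_E |f(y) - f(x)|\,dy = 0;\\
x = 0,\ E = (-s,t),\ t \le 1:&\quad
  \frac{1}{m(E)}\int_E |f(y) - f(0)|\,dy = \frac{1}{s+t}\int_{-s}^{0} 1\,dy = \frac{s}{s+t}.
\end{align*}
```

For $x = 0$ the ratio $s/(s+t)$ takes every value in $(0, 1)$ on arbitrarily small intervals, so the limit in (5) does not exist there; the same happens at $x = 1$, and $\{0, 1\}$ is $m$-null.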


Exercises 


Translations in $L^p$


1. Let $L^p$ be the Lebesgue space on $\mathbb{R}^k$ with respect to Lebesgue measure. For each $t \in \mathbb{R}^k$, let
\[ [T(t)f](x) = f(x + t) \qquad (f \in L^p;\ x \in \mathbb{R}^k). \]
This so-called "translation operator" is a linear isometry of $L^p$ onto itself. Prove that $T(t)f \to f$ in $L^p$-norm as $t \to 0$, for each $f \in L^p$ ($1 \le p < \infty$). (Hint: use Corollary 3.21 and an "$\epsilon/3$ argument".)


Automatic regularity 


2. Let $X$ be a locally compact Hausdorff space in which every open set is $\sigma$-compact (e.g., a Euclidean space). Then every positive Borel measure $\lambda$ which is finite on compact sets is regular. (Hint: consider the positive linear functional $\phi(f) := \int_X f \, d\lambda$. If $(X, \mathcal{M}, \mu)$ is the associated measure space as in Theorem 3.18, show that $\lambda = \mu$ on open sets and use Theorem 3.19.)




Hardy inequality 


3. Let $1 < p < \infty$, and let $L^p(\mathbb{R}^+)$ denote the Lebesgue space for $\mathbb{R}^+ := (0, \infty)$ with respect to the Lebesgue measure. For $f \in L^p(\mathbb{R}^+)$, define
\[ (Tf)(x) = (1/x) \int_0^x f(t)\, dt \qquad (x \in \mathbb{R}^+). \]
Prove:

(a) $Tf$ is well defined, and $|(Tf)(x)| \le x^{-1/p} \|f\|_p$.

(b) Denote by $D$, $M$, and $I$ the differentiation, multiplication by $x$, and identity operators, respectively (on appropriate domains). Verify the identities
\[ MDT = I - T \quad \text{on } C_c^+(\mathbb{R}^+), \tag{1} \]
where multiplication of operators is their composition, and
\[ \|Tf\|_p^p = q \int_0^\infty f\,(Tf)^{p-1}\, dx \tag{2} \]
for all $f \in C_c^+(\mathbb{R}^+)$, where $q$ is the conjugate exponent of $p$. (Hint: integrate by parts.)

(c) $\|Tf\|_p \le q \|f\|_p$ for all $f \in C_c^+(\mathbb{R}^+)$.

(d) Extend the (Hardy) inequality (c) to all $f \in L^p(\mathbb{R}^+)$. (Hint: use Corollary 3.21.)

(e) Show that $\sup_{0 \ne f \in L^p} \|Tf\|_p / \|f\|_p = q$. (Hint: consider the functions $f_n(x) = x^{-1/p} I_{[1,n]}(x)$.)
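A quick consistency check of parts (a) and (c) on one explicit function may be useful before attempting the proofs. The choice $p = 2$ (so $q = 2$) and $f = I_{(0,1]}$ is an assumption made for illustration; this $f$ is not continuous, so it falls under the extension in part (d).

```latex
% Illustrative check: p = q = 2, f = I_{(0,1]} \in L^2(\mathbb{R}^+), \|f\|_2 = 1.
\begin{align*}
(Tf)(x) &= \frac1x\int_0^x f(t)\,dt =
  \begin{cases} 1, & 0 < x \le 1,\\ 1/x, & x > 1, \end{cases}
  \qquad\text{so } |(Tf)(x)| \le x^{-1/2} = x^{-1/2}\,\|f\|_2;\\
\|Tf\|_2^2 &= \int_0^1 1\,dx + \int_1^\infty x^{-2}\,dx = 2,
  \qquad \|Tf\|_2 = \sqrt2 \le 2 = q\,\|f\|_2 .
\end{align*}
```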


Absolutely continuous and singular functions 


4. Recall that a function $f : \mathbb{R} \to \mathbb{C}$ has bounded variation if its total variation
function $v_f$ is bounded, where


$$v_f(x) := \sup_P \sum_k |f(x_k) - f(x_{k-1})| < \infty,$$


and $P = \{x_k;\ k = 0,\ldots,n\}$, $x_{k-1} < x_k$, $x_n = x$ (the supremum is taken
over all such “partitions” $P$ of $(-\infty, x]$).

The total variation of $f$ is $V(f) := \sup_{\mathbb{R}} v_f$.

It follows from a theorem of Jordan that such a function has a
“canonical” (Jordan) decomposition $f = \sum_{k=0}^{3} i^k f_k$, where the $f_k$ are non-decreasing
real functions. Therefore, $f$ has one-sided limits at every point.
We say that $f$ is normalized if it is left-continuous and $f(-\infty) = 0$.


(a) Let $\mu$ be a complex Borel measure on $\mathbb{R}$. Show that $f(x) := \mu((-\infty, x))$
is a normalized function of bounded variation (briefly, $f$ is NBV).




(b) Conversely, if $f$ is NBV and $\mu$ is the corresponding Lebesgue–Stieltjes
measure (constructed through the Jordan decomposition of $f$ as in
Chapter 2, with left continuity replacing right continuity), then $\mu$
(restricted to $\mathcal{B} := \mathcal{B}(\mathbb{R})$) is a complex Borel measure such that $f(x) =
\mu((-\infty, x))$ for all $x \in \mathbb{R}$. (Also $v_f(x) = |\mu|((-\infty, x))$ and $V(f) = \|\mu\|$.)


(c) $f : \mathbb{R} \to \mathbb{C}$ is absolutely continuous if for each $\epsilon > 0$ there exists $\delta > 0$
such that whenever $\{(a_k, b_k);\ k = 1,\ldots,n\}$ is a finite family of disjoint
intervals of total length $< \delta$, we have $\sum_k |f(b_k) - f(a_k)| < \epsilon$. If $f$ is
NBV and $\mu$ is the Borel measure associated to $f$ as in Part (b), then
$\mu \ll m$ iff $f$ is absolutely continuous (cf. Theorem 3.28 and Exercise 8(f)
in Chapter 1).


(d) Let $h \in L^1 := L^1(\mathbb{R})$, $f(x) = \int_{-\infty}^x h(t)\,dt$, and $\mu(E) = \int_E h(t)\,dt$
$(E \in \mathcal{B})$. Conclude from Parts (a) and (c) that $f$ is absolutely
continuous and $D\mu = h$ $m$-a.e. (cf. Theorem 3.28).


(e) Let $\mu$ and $f$ be as in Part (a), and let $x \in \mathbb{R}$ be fixed. Show that $(D\mu)(x)$
exists iff $f'(x)$ exists, and then $f'(x) = (D\mu)(x)$. In particular, if $\mu \perp m$,
then $f' = 0$ $m$-a.e. (such a function is called a singular function) (cf.
Theorem 3.28).


(f) With $h$ and $f$ as in Part (d), conclude from Parts (d) and (e) (and
Theorem 3.28) that $f' = h$ $m$-a.e.


(g) If $f$ is NBV, show that $f'$ exists $m$-a.e. and is in $L^1$, and $f(x) =
f_s(x) + \int_{-\infty}^x f'(t)\,dt$, where $f_s$ is a singular NBV function. (Apply
Parts (b), (e), (f), and the Lebesgue decomposition.)


Cantor functions 


5. Let $\{r_n\}_{n=0}^\infty$ be a positive decreasing sequence with $r_0 = 1$. Denote $r =
\lim_n r_n$. Let $C_0 = [0,1]$, and for $n \in \mathbb{N}$, let $C_n$ be the union of the $2^n$
disjoint closed intervals of length $r_n/2^n$ obtained by removing open intervals
at the center of the $2^{n-1}$ intervals comprising $C_{n-1}$ (note that the removed
intervals have length $(r_{n-1} - r_n)/2^{n-1} > 0$ and $m(C_n) = r_n$). Let $C =
\bigcap_n C_n$.


(a) $C$ is a compact set of Lebesgue measure $r$.


(b) Let $g_n = r_n^{-1} I_{C_n}$ and $f_n(x) = \int_0^x g_n(t)\,dt$. Then $f_n$ is continuous, non-
decreasing, constant on each open interval comprising $C_n^c$, $f_n(0) = 0$,
$f_n(1) = 1$, and $f_n$ converge uniformly in $[0,1]$ to some function $f$. The
function $f$ is continuous, non-decreasing, has range equal to $[0,1]$, and
$f' = 0$ on $C^c$. (In particular, if $r = 0$, $f' = 0$ $m$-a.e., but $f$ is not
constant. Such so-called Cantor functions are examples of continuous
non-decreasing non-constant singular functions.)




Semi-continuity 


6. Let $X$ be a topological space. A function $f : X \to \mathbb{R}$ is lower semi-
continuous (l.s.c.) if $[f > c]$ is open for all real $c$; $f$ is upper semi-continuous
(u.s.c.) if $[f < c]$ is open for all real $c$. Prove:


(a) $f$ is continuous iff it is both l.s.c. and u.s.c.
(b) If $f$ is l.s.c. (u.s.c.) and $\alpha$ is a positive constant, then $\alpha f$ is l.s.c. (u.s.c.,
respectively). Also $-f$ is u.s.c. (l.s.c., respectively).
(c) If $f, g$ are l.s.c. (u.s.c.), then $f + g$ is l.s.c. (u.s.c., respectively).
(d) The supremum (infimum) of any family of l.s.c. (u.s.c.) functions is
l.s.c. (u.s.c., respectively).
(e) If $\{f_n\}$ is a sequence of non-negative l.s.c. functions, then $f := \sum_n f_n$
is l.s.c.
(f) The indicator $I_A$ is l.s.c. (u.s.c.) if $A \subset X$ is open (closed, respectively).
(g) The following conditions are equivalent:
• $f$ is l.s.c.;
• for each net $\{x_\alpha\}_{\alpha \in A}$ in $X$ converging to $x \in X$ such that $f(x_\alpha) \le
f(x)$ for each $\alpha \in A$, $\{f(x_\alpha)\}_{\alpha \in A}$ converges to $f(x)$;
• for each net $\{x_\alpha\}_{\alpha \in A}$ in $X$ converging to $x \in X$ we have $f(x) \le
\liminf_{\alpha \in A} f(x_\alpha)$;
• for each net $\{x_\alpha\}_{\alpha \in A}$ in $X$ converging to $x \in X$ such that
$\{f(x_\alpha)\}_{\alpha \in A}$ converges in $\mathbb{R}$ we have $f(x) \le \lim_{\alpha \in A} f(x_\alpha)$.


When $X$ is first countable, nets can be replaced by sequences in these
conditions.


7. Let $(X, \mathcal{M}, \mu)$ be a positive measure space as in the Riesz–Markov theorem.


(a) Let $0 \le f \in L^1(\mu)$ and $\epsilon > 0$. Represent $f = \sum_{j=1}^\infty c_j I_{E_j}$ as in
Exercise 15 in Chapter 1, and choose $K_j$ compact and $V_j$ open such
that $K_j \subset E_j \subset V_j$ and $\mu(V_j - K_j) < \epsilon/(c_j 2^{j+1})$. Fix $n$ such that
$\sum_{j>n} c_j \mu(E_j) < \epsilon/2$ and define $u = \sum_{j=1}^n c_j I_{K_j}$ and $v = \sum_{j=1}^\infty c_j I_{V_j}$.
Prove that $u$ is u.s.c., $v$ is l.s.c., $u \le f \le v$, and $\int_X (v - u)\,d\mu < \epsilon$.

(b) Generalize the conclusion in (a) to any real function $f \in L^1(\mu)$. (This
is the Vitali–Carathéodory theorem.) (Hint: Exercise 6.)


Fundamental theorem of calculus 


8. Let $f : [a,b] \to \mathbb{R}$ be differentiable at every point of $[a,b]$, and suppose $f' \in
L^1 := L^1([a,b])$ (with respect to Lebesgue measure $dt$). Denote $\int_a^b f'(t)\,dt =
c$ and fix $\epsilon > 0$. By Exercise 7, there exists $v$ l.s.c. such that $f' \le v$ and
$\int_a^b v\,dt < c + \epsilon$. Fix a constant $r > 0$ such that $r(b - a) < c + \epsilon - \int_a^b v\,dt$,
and let $g = v + r$. Observe that $g$ is l.s.c., $g > f'$, and $\int_a^b g\,dt < c + \epsilon$. By
the l.s.c. property of $g$ and the differentiability of $f$, we may associate to
each $x \in [a,b)$ a number $\delta(x) > 0$ such that $g(t) > f'(x)$ and $f(t) - f(x) <
(t - x)[f'(x) + \epsilon]$ for all $t \in (x, x + \delta(x))$.

Define


$$F(x) = \int_a^x g(t)\,dt - f(x) + f(a) + \epsilon(x - a).$$
($F$ is clearly continuous and $F(a) = 0$.)


(a) Show that $F(t) > F(x)$ for all $t \in (x, x + \delta(x))$.
(b) Conclude that $F(b) \ge 0$, and consequently $f(b) - f(a) < c + \epsilon(1 + b - a)$.
Hence $f(b) - f(a) \le c$.


(c) Conclude that $\int_a^b f'(t)\,dt = f(b) - f(a)$. (Hint: replace $f$ by $-f$ in the
conclusion of Part (b).)


Approximation almost everywhere by continuous 
functions 


9. Let $(X, \mathcal{M}, \mu)$ be a positive measure space as in the Riesz–Markov theorem.
Let $f : X \to \mathbb{C}$ be a bounded measurable function vanishing outside some
measurable set of finite measure. Prove that there exists a sequence $\{g_n\} \subset
C_c(X)$ such that $\|g_n\|_u \le \|f\|_u$ and $g_n \to f$ almost everywhere. (Hint:
Lusin and Exercise 16 of Chapter 1.)


4


Continuous linear 
functionals 


The general form of continuous linear functionals on Hilbert spaces was described 
in Theorem 1.37. In this chapter, we shall obtain the general form of continuous 
linear functionals on some of the normed spaces we have encountered. 

The starting point of operator theory is the notion of a bounded linear
operator and its norm, which we introduce in the first section. The
conjugate space $X^*$ of a normed space $X$ is the normed space of all bounded,
equivalently continuous, linear functionals on $X$.

We prove two major representation theorems. The first describes the
conjugate of the Lebesgue spaces. It says that there exists a natural isometric
isomorphism between $L^p(\mu)^*$ and $L^q(\mu)$ for a positive measure $\mu$, $1 < p < \infty$ and
$q$ its conjugate exponent (or $p = 1$ and $q = \infty$ under an additional assumption).
The second, called the Riesz representation theorem, says that for a locally
compact Hausdorff space $X$, every element of $C_c(X)^*$ (equivalently: $C_0(X)^*$) is
induced by a complex Borel measure in the same way that a positive measure
induces a positive (but not necessarily bounded!) functional on $C_c(X)$ as in the
Riesz–Markov theorem (3.18).

The final section presents another celebrated application of the Riesz— 
Markov theorem: the construction of the Haar measure. Every locally compact 
topological group $G$ is shown to admit a (unique up to scaling) non-zero
left translation invariant positive linear functional on $C_c(G)$, thus a non-zero
left invariant positive measure on G. The Haar measure is a far-reaching 
generalization of the Lebesgue measure on R and is one of the cornerstones 
of abstract harmonic analysis. Chapter 5 presents an alternative proof of the 
existence and uniqueness of the Haar measure for compact topological groups. 


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press. 
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0004




4.1 Linear maps 
We consider first some basic facts about arbitrary linear maps between normed 
spaces. 


Definition 4.1. Let $X, Y$ be normed spaces (over $\mathbb{C}$, to fix the ideas), and
let $T : X \to Y$ be a linear map (it is customary to write $Tx$ instead of $T(x)$,
and norms are denoted by $\|\cdot\|$ in any normed space, unless some distinction is
absolutely necessary). One says that $T$ is bounded if

$$\|T\| := \sup_{x \ne 0} \frac{\|Tx\|}{\|x\|} < \infty.$$

Equivalently, $T$ is bounded iff there exists $M \ge 0$ such that

$$\|Tx\| \le M\|x\| \qquad (x \in X), \qquad (1)$$

and $\|T\|$ is the smallest constant $M$ for which (1) is valid. In particular

$$\|Tx\| \le \|T\|\,\|x\| \qquad (x \in X). \qquad (2)$$

The homogeneity of $T$ shows that the following conditions are equivalent:


(a) $T$ is “bounded”;


(b) the map $T$ is bounded (in the usual sense) on the “unit ball”
$B_X := \{x \in X;\ \|x\| < 1\}$;


(c) the map $T$ is bounded on the “closed unit ball” $\bar{B}_X := \{x \in X;\ \|x\| \le 1\}$;
(d) the map $T$ is bounded on the “unit sphere” $S_X := \{x \in X;\ \|x\| = 1\}$.

In addition, one has (for $T$ bounded):

$$\|T\| = \sup_{x \in B_X} \|Tx\| = \sup_{x \in \bar{B}_X} \|Tx\| = \sup_{x \in S_X} \|Tx\|. \qquad (3)$$


By (3), the set $B(X,Y)$ of all bounded linear maps from $X$ to $Y$ is a complex
vector space for the pointwise operations, and $\|\cdot\|$ is a norm on $B(X,Y)$, called
the operator norm or the uniform norm.
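
The equalities in (3) follow from homogeneity alone. The display below is a short sketch of that argument, assuming $X \ne \{0\}$; the scaling parameter $r$ is introduced here only for illustration and does not appear in the text.

```latex
% Sketch: why the three suprema in (3) agree with ||T|| (T bounded, X nonzero).
\begin{align*}
\frac{\|Tx\|}{\|x\|} &= \Big\|T\Big(\frac{x}{\|x\|}\Big)\Big\| \quad (x \neq 0)
  &&\Longrightarrow\quad \|T\| = \sup_{x \in S_X}\|Tx\|, \\
\|Tx\| &\le \|T\|\,\|x\| \le \|T\| \quad (x \in \bar{B}_X)
  &&\Longrightarrow\quad \sup_{x \in B_X}\|Tx\| \le \sup_{x \in \bar{B}_X}\|Tx\| \le \|T\|, \\
\|T(rx)\| &= r\,\|Tx\| \quad (x \in S_X,\ 0 < r < 1)
  &&\Longrightarrow\quad \sup_{x \in B_X}\|Tx\| \ge r\,\|T\| \xrightarrow[r \to 1]{} \|T\|.
\end{align*}
```

Since $S_X \subset \bar{B}_X$, the first line also gives $\sup_{\bar{B}_X}\|Tx\| \ge \|T\|$, and the three lines together yield (3).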


Theorem 4.2. Let $X, Y$ be normed spaces, and $T : X \to Y$ be linear. Then the
following properties are equivalent:


(i) $T \in B(X,Y)$;
(ii) $T$ is uniformly continuous on $X$;


(iii) $T$ is continuous at some point $x_0 \in X$.


Proof. Assume (i). Then for all $x, y \in X$,


$$\|Tx - Ty\| = \|T(x - y)\| \le \|T\|\,\|x - y\|,$$


which clearly implies (ii) (actually, this is the stronger property: $T$ is Lipschitz
with Lipschitz constant $\|T\|$).
Trivially, (ii) implies (iii). Finally, if (iii) holds, there exists $\delta > 0$ such that
$$\|Tx - Tx_0\| < 1$$
whenever $\|x - x_0\| < \delta$.
By linearity of $T$, this is equivalent to: $\|Tz\| < 1$ whenever $z \in X$ and
$\|z\| < \delta$. Since $\|\delta x\| < \delta$ for all $x \in B_X$, it follows that $\delta\|Tx\| = \|T(\delta x)\| < 1$,
that is, $\|Tx\| < 1/\delta$ on $B_X$, hence $\|T\| \le 1/\delta$.
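
To see that the equivalent conditions of Theorem 4.2 can fail, here is a standard illustration that is not taken from the text: the space of polynomials on $[0,1]$ with the uniform norm and the differentiation map $D$ are assumptions chosen for the example.

```latex
% Illustration (not from the text): an unbounded linear map.
% X = Y = polynomials on [0,1] with the uniform norm; D p := p'.
\begin{align*}
p_n(t) &:= t^n, \qquad \|p_n\|_u = \sup_{0 \le t \le 1} t^n = 1, \\
\|D p_n\|_u &= \sup_{0 \le t \le 1} n\,t^{n-1} = n \qquad (n \ge 1), \\
\sup_{p \neq 0} \frac{\|Dp\|_u}{\|p\|_u} &\ge \sup_n n = \infty .
\end{align*}
```

So $D$ is linear but not bounded, and by Theorem 4.2 it is not continuous at any point of this normed space.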


Notation 4.3. Let X be a (complex) normed space. Then 
B(X) = B(X, X); 
X* := B(X,C). 


Elements of B(X) will be called bounded operators on X; elements of X* will be 
called bounded linear functionals on X, and will be denoted usually by x*,y*,.... 


Since the norm on $\mathbb{C}$ is the absolute value, the norm of $x^* \in X^*$ as defined
in Definition 4.1 takes the form


$$\|x^*\| = \sup_{x \ne 0} |x^*x|/\|x\| = \sup_{x \in B_X} |x^*x|;$$


also, (2) takes the form
$$|x^*x| \le \|x^*\|\,\|x\| \qquad (x \in X).$$
The normed space $X^*$ is called the (normed) dual or the conjugate space of $X$.


Theorem 4.4. Let X,Y be normed spaces. If Y is complete, then B(X,Y) is 
complete. 


Proof. Suppose $Y$ is complete, and let $\{T_n\} \subset B(X,Y)$ be a Cauchy sequence.
For each $x \in X$,


$$\|T_n x - T_m x\| = \|(T_n - T_m)x\| \le \|T_n - T_m\|\,\|x\| \to 0$$


when $n, m \to \infty$, that is, $\{T_n x\}$ is Cauchy in $Y$. Since $Y$ is complete, the limit
$\lim_n T_n x$ exists in $Y$. We denote it by $Tx$. By the basic properties of limits, the
map $T : X \to Y$ is linear. Given $\epsilon > 0$, there exists $n_0 \in \mathbb{N}$ such that


$$\|T_n - T_m\| < \epsilon, \qquad n, m > n_0.$$
Therefore
$$\|T_n x - T_m x\| \le \epsilon\|x\|, \qquad n, m > n_0,\ x \in X.$$
Letting $m \to \infty$, we get by continuity of the norm
$$\|T_n x - Tx\| \le \epsilon\|x\|, \qquad n > n_0,\ x \in X.$$


In particular $T_n - T \in B(X,Y)$, and thus $T = T_n - (T_n - T) \in B(X,Y)$, and
$\|T_n - T\| \le \epsilon$ for all $n > n_0$. This shows that $B(X,Y)$ is complete.


Since C is complete, we have the following: 


Corollary 4.5. The conjugate space of any normed space is complete. 




4.2 The conjugates of Lebesgue spaces 


Theorem 4.6.


(i) Let $(X, \mathcal{A}, \mu)$ be a positive measure space. Let $1 < p < \infty$, let $q$ be
the conjugate exponent, and let $\phi \in L^p(\mu)^*$. Then there exists a unique
element $g \in L^q(\mu)$ such that


$$\phi(f) = \int_X f g\,d\mu \qquad (f \in L^p(\mu)). \qquad (1)$$


Moreover, the map $\phi \mapsto g$ is an isometric isomorphism of $L^p(\mu)^*$ and
$L^q(\mu)$.
(ii) In case $p = 1$, the result is valid if the measure space is $\sigma$-finite.


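Proof. Uniqueness. If $g, g'$ are as in the theorem, and $h := g - g'$, then $h \in L^q(\mu)$
and


$$\int_X f h\,d\mu = 0 \qquad (f \in L^p(\mu)). \qquad (2)$$
The function $\theta : \mathbb{C} \to \mathbb{C}$ defined by
$$\theta(z) = |z|/z \ \text{ for } z \ne 0; \qquad \theta(0) = 0$$
is Borel, so that, in case $1 < p < \infty$, the function $f := |h|^{q-1}\theta(h)$ is measurable,
and
$$\int_X |f|^p\,d\mu = \int_X |h|^{(q-1)p}\,d\mu = \int_X |h|^q\,d\mu < \infty.$$


Hence by (2) for this function $f$,


$$0 = \int_X |h|^{q-1}\theta(h)h\,d\mu = \int_X |h|^q\,d\mu,$$


and consequently $h$ is the zero element of $L^q(\mu)$.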
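
The two exponent identities used in this step can be checked directly. The display below is only a bookkeeping sketch, assuming $1 < p < \infty$ and $1/p + 1/q = 1$.

```latex
% Exponent bookkeeping for conjugate exponents, 1 < p < \infty, 1/p + 1/q = 1.
\begin{align*}
q = \frac{p}{p-1} \quad&\Longrightarrow\quad q - 1 = \frac{1}{p-1}
   \quad\Longrightarrow\quad (q-1)\,p = \frac{p}{p-1} = q, \\
\theta(h)\,h &= |h| \ \text{ on } [h \neq 0]
   \quad\Longrightarrow\quad |h|^{q-1}\theta(h)\,h = |h|^{q}.
\end{align*}
```

The same identity $(q-1)p = q$, together with $1 - 1/p = 1/q$, is what makes the test function $|g|^{q-1}\theta(g)$ work in the existence part of the proof.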

In case $p = 1$, take in (2) $f = I_E$, where $E \in \mathcal{A}$ and $0 < \mu(E) < \infty$ (so that
$f \in L^1(\mu)$). Then $0 = (1/\mu(E)) \int_E h\,d\mu$ for all such $E$, and therefore $h = 0$ a.e.
by the Averages lemma (Lemma 1.38) (since we assume that $X$ is $\sigma$-finite in case
$p = 1$).

Existence. Let $(X, \mathcal{A}, \mu)$ be an arbitrary positive measure space, and
$1 \le p < \infty$. If $g \in L^q(\mu)$ and we define $\psi(f)$ by the right-hand side of (1),
then Hölder's inequality (Theorem 1.26) implies that $\psi$ is a well-defined linear
functional on $L^p(\mu)$, and


$$|\psi(f)| \le \|g\|_q \|f\|_p \qquad (f \in L^p(\mu)),$$
so that $\psi \in L^p(\mu)^*$ and
$$\|\psi\| \le \|g\|_q. \qquad (3)$$


In order to prove the existence of g as in the theorem, it suffices to prove the 
following: 


Claim. There exists a complex measurable function $g$ such that


$$\|g\|_q \le \|\phi\| \qquad (4)$$


and


$$\phi(I_E) = \int_E g\,d\mu \qquad (E \in \mathcal{A}_0), \qquad (5)$$
where $\mathcal{A}_0 := \{E \in \mathcal{A};\ \mu(E) < \infty\}$.


Indeed, Relation (5) means that (1) is valid for $f = I_E$, for all $E \in \mathcal{A}_0$; by
linearity of $\phi$ and $\psi$, (1) is then valid for all simple functions in $L^p(\mu)$. Since
these functions are dense in $L^p(\mu)$ (Theorem 1.27), the conclusion $\phi = \psi$ follows
from the continuity of both functionals on $L^p(\mu)$, and the relation $\|g\|_q = \|\phi\|$
follows then from (3) and (4).


Proof of the claim. Case of a finite measure space $(X, \mathcal{A}, \mu)$. In that case,
$I_E \in L^p(\mu)$ for any $E \in \mathcal{A}$, and $\|I_E\|_p = \mu(E)^{1/p}$. Consider the trivially additive
set function

$$\lambda(E) := \phi(I_E) \qquad (E \in \mathcal{A}).$$
If $\{E_k\} \subset \mathcal{A}$ is a sequence of mutually disjoint sets with union $E$, set $A_n =
\bigcup_{k=1}^n E_k$. Then


$$\|I_{A_n} - I_E\|_p = \|I_{E - A_n}\|_p = \mu(E - A_n)^{1/p} \to 0$$


as $n \to \infty$, since $\{E - A_n\}$ is a decreasing sequence of measurable sets with
empty intersection (cf. Lemma 1.11). Since $\phi$ is continuous on $L^p(\mu)$, it follows
that


$$\lambda(E) := \phi(I_E) = \lim_n \phi(I_{A_n}) = \lim_n \lambda(A_n)
= \lim_n \sum_{k=1}^n \lambda(E_k) = \sum_{k=1}^\infty \lambda(E_k),$$


so that $\lambda$ is a complex measure.

If $\mu(E) = 0$ for some $E \in \mathcal{A}$, then $\|I_E\|_p = 0$, and therefore $\lambda(E) :=
\phi(I_E) = 0$ by linearity of $\phi$. This means that $\lambda \ll \mu$, and therefore, by the
Radon–Nikodym theorem, there exists $g \in L^1(\mu)$ such that


$$\phi(I_E) = \int_E g\,d\mu = \int_X I_E\,g\,d\mu \qquad (E \in \mathcal{A}).$$


Thus (5) is valid, with $g$ integrable. We show that this modified version of (5)
implies (4) (hence the claim).

By linearity of $\phi$ and the integral, it follows from (5) (modified version) that
(1) is valid for all simple measurable functions $f$. If $f$ is a bounded measurable
function, there exists a sequence of simple measurable functions $s_n$ such that
$\|s_n - f\|_u \to 0$ (cf. Theorem 1.8). Then


$$\|s_n - f\|_p \le \|s_n - f\|_u\,\mu(X)^{1/p} \to 0,$$



and therefore, by continuity of $\phi$,
$$\phi(f) = \lim_n \phi(s_n) = \lim_n \int_X s_n g\,d\mu.$$
Also
$$\Big|\int_X s_n g\,d\mu - \int_X f g\,d\mu\Big| \le \|s_n - f\|_u \|g\|_1 \to 0,$$


and we conclude that (1) is valid for all bounded measurable functions $f$.
Case $p = 1$. For any $E \in \mathcal{A}$ with $\mu(E) > 0$,


$$\Big|\frac{1}{\mu(E)} \int_E g\,d\mu\Big| = \frac{|\phi(I_E)|}{\|I_E\|_1} \le \|\phi\|.$$


Therefore, $|g| \le \|\phi\|$ a.e. (by the Averages lemma), that is,


$$\|g\|_\infty \le \|\phi\|,$$


as desired.

Case $1 < p < \infty$. Let $E_n := [|g| \le n]$ $(n = 1, 2, \ldots)$. Define $f_n :=
I_{E_n}|g|^{q-1}\theta(g)$, with $\theta$ as in the beginning of the proof. Then $f_n$ are bounded
measurable functions, so that by (1) for such functions,


$$\int_{E_n} |g|^q\,d\mu = \int_X f_n g\,d\mu = \phi(f_n) = |\phi(f_n)| \le \|\phi\|\,\|f_n\|_p.$$
However, since $|f_n|^p = I_{E_n}|g|^{(q-1)p} = I_{E_n}|g|^q$, it follows that


$$\|f_n\|_p^p = \int_{E_n} |g|^q\,d\mu.$$


Therefore


$$\Big(\int_{E_n} |g|^q\,d\mu\Big)^{1 - 1/p} \le \|\phi\|,$$


that is,


$$\Big(\int_X I_{E_n}|g|^q\,d\mu\Big)^{1/q} \le \|\phi\|.$$


Since $0 \le I_{E_1}|g|^q \le I_{E_2}|g|^q \le \cdots$ and $\lim_n I_{E_n}|g|^q = |g|^q$, the monotone
convergence theorem implies that $\|g\|_q \le \|\phi\|$, as wanted.

Case of a $\sigma$-finite measure space; $1 \le p < \infty$. We use the function $w$ and
the equivalent finite measure $d\nu = w\,d\mu$ (satisfying $\nu(X) = 1$), as defined in the
proof of Theorem 1.40. Define


$$V_p : L^p(\nu) \to L^p(\mu)$$


by
$$V_p f = w^{1/p} f.$$




Then

$$\|V_p f\|_{L^p(\mu)}^p = \int_X |f|^p w\,d\mu = \int_X |f|^p\,d\nu = \|f\|_{L^p(\nu)}^p,$$
so that $V_p$ is a linear isometry of $L^p(\nu)$ onto $L^p(\mu)$. Consequently, $\phi \circ V_p \in
L^p(\nu)^*$, and $\|\phi \circ V_p\| = \|\phi\|$ (where the norms are those of the respective
dual spaces). Since $\nu$ is a finite measure, there exists (by the preceding case)
a measurable function $g_1$ such that


$$\|g_1\|_{L^q(\nu)} \le \|\phi \circ V_p\| = \|\phi\|, \qquad (6)$$


$$(\phi \circ V_p)(f) = \int_X f g_1\,d\nu \qquad (f \in L^p(\nu)). \qquad (7)$$


Thus, for all $E \in \mathcal{A}_0$,
$$\phi(I_E) = (\phi \circ V_p)(w^{-1/p} I_E) = \int_E w^{-1/p} g_1\,d\nu = \int_E w^{1/q} g_1\,d\mu. \qquad (8)$$


In case $p > 1$ (so that $1 < q < \infty$), set $g = w^{1/q} g_1$ $(= V_q g_1)$. Then (5) is valid,
and by (6),
$$\|g\|_{L^q(\mu)} = \|g_1\|_{L^q(\nu)} \le \|\phi\|,$$
as desired.
In case $p = 1$ (so that $q = \infty$), we have by (8) $\phi(I_E) = \int_E g_1\,d\mu$. Thus (5) is
valid with $g = g_1$, and since the measures $\mu$ and $\nu$ are equivalent, we have by (6)


$$\|g\|_{L^\infty(\mu)} = \|g_1\|_{L^\infty(\nu)} \le \|\phi\|,$$


as wanted.
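
The change-of-measure steps in this case can be verified line by line. The following display is a sketch of those checks, assuming (as the proof of Theorem 1.40 is understood to provide) that $w > 0$ everywhere and $d\nu = w\,d\mu$.

```latex
% Checks for the sigma-finite case, using d\nu = w\,d\mu and 1 - 1/p = 1/q.
\begin{align*}
\|w^{-1/p} I_E\|_{L^p(\nu)}^p &= \int_E w^{-1}\, w \, d\mu = \mu(E) < \infty
   \qquad (E \in \mathcal{A}_0), \\
\int_E w^{-1/p} g_1 \, d\nu &= \int_E w^{-1/p} g_1\, w \, d\mu
   = \int_E w^{1/q} g_1 \, d\mu, \\
\|w^{1/q} g_1\|_{L^q(\mu)}^q &= \int_X |g_1|^q\, w \, d\mu = \int_X |g_1|^q \, d\nu
   = \|g_1\|_{L^q(\nu)}^q \qquad (1 < q < \infty).
\end{align*}
```

The first line shows that $w^{-1/p} I_E$ is a legitimate argument for $\phi \circ V_p$; the second and third are the identities behind (8) and the norm equality.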

Case of an arbitrary measure space; $1 < p < \infty$. For each $E \in \mathcal{A}_0$, consider
the finite measure space $(E, \mathcal{A} \cap E, \mu)$, and let $L^p(E)$ be the corresponding
$L^p$-space. We can identify $L^p(E)$ (isomorphically and isometrically) with the
subspace of $L^p(\mu)$ of all elements vanishing on $E^c$, and therefore the restriction
$\phi_E := \phi|_{L^p(E)}$ belongs to $L^p(E)^*$ and $\|\phi_E\| \le \|\phi\|$. By the finite measure case,
there exists $g_E \in L^q(E)$ such that


$$\|g_E\|_{L^q(E)} = \|\phi_E\| \ (\le \|\phi\|)$$


and


$$\phi_E(f) = \int_E f g_E\,d\mu \quad \text{for all } f \in L^p(E).$$


If $E, F \in \mathcal{A}_0$, then for all measurable subsets $G$ of $E \cap F$, $I_G \in L^p(E \cap F)
\subset L^p(E)$, so that $\phi_E(I_G) = \phi_{E \cap F}(I_G)$, and therefore


$$\int_G (g_E - g_{E \cap F})\,d\mu = \int_E I_G\,g_E\,d\mu - \int_{E \cap F} I_G\,g_{E \cap F}\,d\mu
= \phi_E(I_G) - \phi_{E \cap F}(I_G) = 0.$$




By Proposition 1.22 (applied to the finite measure space $(E \cap F, \mathcal{A} \cap (E \cap F), \mu)$)
and by symmetry, $g_E = g_{E \cap F} = g_F$ a.e. on $E \cap F$. It follows that for any mutually
disjoint sets $E, F \in \mathcal{A}_0$, $g_{E \cup F}$ coincides a.e. with $g_E$ on $E$ and with $g_F$ on $F$,
and therefore


$$\|g_{E \cup F}\|_q^q = \int_{E \cup F} |g_{E \cup F}|^q\,d\mu
= \int_E |g_E|^q\,d\mu + \int_F |g_F|^q\,d\mu = \|g_E\|_q^q + \|g_F\|_q^q,$$
that is, $\|g_E\|_q^q$ is an additive function of $E$ on $\mathcal{A}_0$. Let


$$K := \sup_{E \in \mathcal{A}_0} \|g_E\|_q \ (\le \|\phi\|),$$


and let then $\{E_n\}$ be a non-decreasing sequence in $\mathcal{A}_0$ such that $\|\phi_{E_n}\| \to K$.
Set $F = \bigcup_n E_n$.

If $E \in \mathcal{A}_0$ and $E \cap F = \emptyset$, then since $E$ and $E_n$ are disjoint for all $n$, it follows
from the additivity of the set function $\|g_E\|_q^q$ that


$$\|g_E\|_q^q = \|g_{E \cup E_n}\|_q^q - \|g_{E_n}\|_q^q \le K^q - \|\phi_{E_n}\|^q \to 0.$$


Hence, $\|g_E\|_q = 0$ for all $E \in \mathcal{A}_0$ disjoint from $F$, that is, $g_E = 0$ a.e. for such
$E$. Consequently, for $E \in \mathcal{A}_0$ arbitrary, we have a.e. on $E - F$ that $g_E = g_{E - F} = 0$,
and therefore


$$\phi(I_E) = \phi_E(I_E) = \int_E g_E\,d\mu = \int_{E \cap F} g_E\,d\mu = \int_{E \cap F} g_{E \cap F}\,d\mu. \qquad (9)$$
Since $g_{E_n} = g_{E_{n+1}}$ a.e. on $E_n$, the limit $g := \lim_n g_{E_n}$ exists a.e. and vanishes
on $F^c$; it is measurable, and by the monotone convergence theorem,
$$\|g\|_{L^q(\mu)} = \lim_n \|g_{E_n}\|_{L^q(\mu)} = \lim_n \|\phi_{E_n}\| = K \le \|\phi\|,$$


and (4) is verified. Fix $n$. For all $k \ge n$, $g_{E \cap F} = g_{E_k}$ a.e. on $(E \cap F) \cap E_k = E \cap E_k$,
hence (a.e.) on $E \cap E_n$. Therefore, $g_{E \cap F} = g$ a.e. on $E \cap E_n$ for all $n$, hence
(a.e.) on $E \cap F$, and consequently (5) follows from (9). This completes the proof
of the claim.


4.3 The conjugate of $C_c(X)$


Let $X$ be a locally compact Hausdorff space, and consider the normed space
$C_c(X)$ with the uniform norm


$$\|f\| = \|f\|_u = \sup_X |f| \qquad (f \in C_c(X)).$$
If $\mu$ is a complex Borel measure on $X$, write $d\mu = h\,d|\mu|$, where $|\mu|$ is the
total variation measure corresponding to $\mu$ and $h$ is a uniquely determined Borel
function with $|h| = 1$ (cf. Theorem 1.46). Set


$$\psi(f) = \int_X f\,d\mu := \int_X f h\,d|\mu| \qquad (f \in C_c(X)). \qquad (1)$$


Then
$$|\psi(f)| \le \int_X |f|\,d|\mu| \le |\mu|(X)\,\|f\|, \qquad (2)$$


so that $\psi$ is a well-defined, clearly linear, continuous functional on $C_c(X)$, with
norm


$$\|\psi\| \le \|\mu\| := |\mu|(X). \qquad (3)$$
We shall prove that every continuous linear functional $\phi$ on $C_c(X)$ is of this form
for a uniquely determined regular complex Borel measure $\mu$, and $\|\phi\| = \|\mu\|$. This
is done by using the Riesz–Markov representation theorem (Theorem 3.18) for positive linear
functionals on $C_c(X)$. Our first step is to associate a positive linear functional
$|\phi|$ to each given $\phi \in C_c(X)^*$.


Definition 4.7. Let $\phi \in C_c(X)^*$. The total variation functional $|\phi|$ is defined by
$$|\phi|(f) := \sup\{|\phi(h)|;\ h \in C_c(X),\ |h| \le f\} \qquad (0 \le f \in C_c(X));$$
$$|\phi|(u + iv) := |\phi|(u^+) - |\phi|(u^-) + i|\phi|(v^+) - i|\phi|(v^-) \qquad (u, v \in C_c^{\mathbb{R}}(X)).$$


Theorem 4.8. The total variation functional $|\phi|$ of $\phi \in C_c(X)^*$ is a positive
linear functional on $C_c(X)$, and it satisfies the inequality


$$|\phi(f)| \le |\phi|(|f|) \le \|\phi\|\,\|f\| \qquad (f \in C_c(X)).$$
Proof. Let $C_c^+(X) := \{f \in C_c(X);\ f \ge 0\}$. It is clear from Definition 4.7 that


$$0 \le |\phi|(f) \le \|\phi\|\,\|f\| < \infty, \qquad (4)$$


$|\phi|$ is monotonic on $C_c^+(X)$ and $|\phi|(cf) = c|\phi|(f)$ (and in particular $|\phi|(0) = 0$)
for all $c \ge 0$ and $f \in C_c^+(X)$. We show that $|\phi|$ is additive on $C_c^+(X)$.

Let $\epsilon > 0$ and $f_k \in C_c^+(X)$ be given $(k = 1, 2)$. By definition, there exist
$h_k \in C_c(X)$ such that $|h_k| \le f_k$ and $|\phi|(f_k) \le |\phi(h_k)| + \epsilon/2$, $k = 1, 2$. Therefore,
writing the complex numbers $\phi(h_k)$ in polar form, $\phi(h_k) = e^{i\theta_k}|\phi(h_k)|$, we obtain

$$0 \le |\phi|(f_1) + |\phi|(f_2) \le |\phi(h_1)| + |\phi(h_2)| + \epsilon$$
$$= e^{-i\theta_1}\phi(h_1) + e^{-i\theta_2}\phi(h_2) + \epsilon = \phi(e^{-i\theta_1}h_1 + e^{-i\theta_2}h_2) + \epsilon$$
$$\le |\phi|(f_1 + f_2) + \epsilon,$$


because


$$|e^{-i\theta_1}h_1 + e^{-i\theta_2}h_2| \le |h_1| + |h_2| \le f_1 + f_2.$$

Hence $|\phi|$ is “super-additive” on $C_c^+(X)$.
Next, let $h \in C_c(X)$ satisfy $|h| \le f_1 + f_2 := f$. Let $V = [f > 0]$. Define for
$k = 1, 2$

$$h_k = (f_k/f)h \ \text{ on } V; \qquad h_k = 0 \ \text{ on } V^c.$$
The functions $h_k$ are continuous on $V$ and $V^c$. If $x$ is a boundary point of $V$,
then $x \notin V$ (since $V$ is open), so that $f(x) = 0$ and $h_k(x) = 0$. Let $\{x_\alpha\} \subset V$ be
a net converging to $x$. Then by continuity of $h$, we have for $k = 1, 2$:


$$|h_k(x_\alpha)| \le |h(x_\alpha)| \to |h(x)| \le |f(x)| = 0,$$




so that $\lim_\alpha h_k(x_\alpha) = 0 = h_k(x)$. This shows that $h_k$ are continuous on $X$.
Trivially, $\operatorname{supp} h_k \subset \operatorname{supp} f_k$, so that $h_k \in C_c(X)$, and by definition, $|h_k| \le f_k$
and $h = h_1 + h_2$. Therefore


$$|\phi(h)| = |\phi(h_1) + \phi(h_2)| \le |\phi|(f_1) + |\phi|(f_2).$$


Taking the supremum over all $h \in C_c(X)$ such that $|h| \le f$, we obtain that $|\phi|$
is subadditive. Together with the super-additivity obtained before, this proves
that $|\phi|$ is additive.

Next, consider $|\phi|$ over $C_c^{\mathbb{R}}(X)$. The homogeneity over $\mathbb{R}$ is easily verified.
Additivity is proved as in Theorem 1.19. Let $f = f^+ - f^-$ and $g = g^+ - g^-$ be
functions in $C_c^{\mathbb{R}}(X)$, and let $h = h^+ - h^- := f + g = f^+ - f^- + g^+ - g^-$. Then
$h^+ + f^- + g^- = f^+ + g^+ + h^-$, so that by the additivity of $|\phi|$ on $C_c^+(X)$, we
obtain


$$|\phi|(h^+) + |\phi|(f^-) + |\phi|(g^-) = |\phi|(f^+) + |\phi|(g^+) + |\phi|(h^-),$$


and since all summands are finite, it follows that


$$|\phi|(h) := |\phi|(h^+) - |\phi|(h^-) = |\phi|(f^+) - |\phi|(f^-) + |\phi|(g^+) - |\phi|(g^-) =: |\phi|(f) + |\phi|(g).$$


The linearity of $|\phi|$ over $C_c(X)$ now follows easily from the definition. Thus $|\phi|$
is a positive linear functional on $C_c(X)$ (cf. (4)). By (4) for the function $|f|$,
$|\phi|(|f|) \le \|\phi\|\,\|f\|$. Also, since $h = f$ belongs to the set of functions used in the
definition of $|\phi|(|f|)$, we have $|\phi(f)| \le |\phi|(|f|)$.
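
A small illustration of Definition 4.7, not taken from the text: the functional $\phi(f) = f(a) - f(b)$ for two distinct points $a, b \in X$ is an assumption chosen for the example, and $u$ below is an auxiliary function supplied by Urysohn's lemma for locally compact Hausdorff spaces.

```latex
% Illustration (assumed example): phi(f) = f(a) - f(b), with a \neq b in X.
% Let u_1 \in C_c(X), 0 \le u_1 \le 1, u_1(a) = 1, u_1(b) = 0 (Urysohn),
% and put u := 2u_1 - 1, so u is continuous, |u| \le 1, u(a) = 1, u(b) = -1.
\begin{align*}
|\phi(h)| &= |h(a) - h(b)| \le f(a) + f(b)
   \qquad (h \in C_c(X),\ |h| \le f,\ 0 \le f \in C_c(X)), \\
\phi(uf) &= u(a)f(a) - u(b)f(b) = f(a) + f(b),
   \qquad |uf| \le f,\ \ uf \in C_c(X), \\
\Longrightarrow\quad |\phi|(f) &= f(a) + f(b) \qquad (0 \le f \in C_c(X)).
\end{align*}
```

So in this example $|\phi|$ is the positive functional $f \mapsto f(a) + f(b)$, in line with the additivity and positivity asserted in Theorem 4.8.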


4.4 The Riesz representation theorem 


Theorem 4.9. Let $X$ be a locally compact Hausdorff space, and let $\phi \in C_c(X)^*$.
Then there exists a unique regular complex Borel measure $\mu$ on $X$ such that


$$\phi(f) = \int_X f\,d\mu \qquad (f \in C_c(X)). \qquad (1)$$


Furthermore,
$$\|\phi\| = \|\mu\|. \qquad (2)$$


“Regularity” of the complex measure $\mu$ means by definition that its total
variation measure $|\mu|$ is regular.
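
The simplest instance of the theorem may help fix ideas. The example below, point evaluation at a fixed $a \in X$, is an illustration chosen here and is not part of the text; $f_0$ is an auxiliary function supplied by Urysohn's lemma.

```latex
% Illustration (assumed example): point evaluation at a fixed a in X.
\begin{align*}
\phi(f) &:= f(a) = \int_X f \, d\delta_a \qquad (f \in C_c(X)),
   \qquad \delta_a(E) := I_E(a), \\
|\phi(f)| &\le \|f\| \quad\text{and}\quad \phi(f_0) = 1
   \ \text{ for some } f_0 \in C_c(X),\ 0 \le f_0 \le 1,\ f_0(a) = 1, \\
\Longrightarrow\quad \|\phi\| &= 1 = \delta_a(X) = \|\delta_a\| .
\end{align*}
```

Here $\mu = \delta_a$ is a positive measure, so $|\mu| = \mu$, and both (1) and (2) can be read off directly.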


Proof. We apply Theorem 3.18 to the positive linear functional $|\phi|$. Denote by
$\lambda$ the positive Borel measure obtained by restricting the measure associated with
$|\phi|$ (by Theorem 3.18) to the Borel algebra $\mathcal{B}(X) \subset \mathcal{M}$. Then


$$|\phi|(f) = \int_X f\,d\lambda \qquad (f \in C_c(X)). \qquad (3)$$
By Definition 3.5 and Theorem 4.8,

$$\lambda(X) = \sup\{|\phi|(f);\ 0 \le f \le 1\} \le \|\phi\|. \qquad (4)$$




In particular, every Borel set in $X$ has finite $\lambda$-measure, and therefore, by
Theorem 3.18 (cf. (3) and (4)(ii)), $\lambda$ is regular.
By Theorem 4.8 and (3), for all $f \in C_c(X)$,


$$|\phi(f)| \le |\phi|(|f|) = \int_X |f|\,d\lambda = \|f\|_{L^1(\lambda)}.$$


This shows that $\phi$ is a continuous linear functional on the subspace $C_c(X)$ of
$L^1(\lambda)$, with norm $\le 1$. By Theorem 3.21, $C_c(X)$ is dense in $L^1(\lambda)$, and it follows
that $\phi$ has a unique extension as an element of $L^1(\lambda)^*$ with norm $\le 1$. By
Theorem 4.6, there exists a unique element $g \in L^\infty(\lambda)$ such that


$$\phi(f) = \int_X f g\,d\lambda \qquad (f \in C_c(X)) \qquad (5)$$


and $\|g\|_\infty \le 1$.
Define $d\mu = g\,d\lambda$. Then $\mu$ is a complex Borel measure satisfying (1). By
Theorem 1.47 and (4),


$$|\mu|(X) = \int_X |g|\,d\lambda \le \lambda(X) \le \|\phi\|.$$


By (3) of Section 4.3, the reversed inequality is a consequence of (1), so that (2)
follows.
Gathering some of these inequalities, we have


$$\|\phi\| = |\mu|(X) = \int_X |g|\,d\lambda \le \lambda(X) \le \|\phi\|.$$


Thus, $\lambda(X) = \int_X |g|\,d\lambda$, that is, $\int_X (1 - |g|)\,d\lambda = 0$. Since $1 - |g| \ge 0$ $\lambda$-a.e., it
follows that $|g| = 1$ a.e., and since $g$ is only a.e.-determined, we may choose $g$
such that $|g| = 1$ identically on $X$.

For all Borel sets $E$, $|\mu|(E) = \int_E |g|\,d\lambda = \lambda(E)$, which proves that $|\mu| = \lambda$.
In particular, $\mu$ is regular.

In order to prove uniqueness, we observe that the sum $\nu$ of two finite positive
regular Borel measures $\nu_k$ $(k = 1, 2)$ is regular. Indeed, given $\epsilon > 0$ and $E \in \mathcal{B}(X)$, there
exist $K_k$ compact and $V_k$ open such that $K_k \subset E \subset V_k$ and


$$\nu_k(V_k) - \epsilon/2 \le \nu_k(E) \le \nu_k(K_k) + \epsilon/2.$$


Then $K := K_1 \cup K_2 \subset E \subset V := V_1 \cap V_2$, $K$ is compact, $V$ is open, and by
monotonicity of positive measures,


$$\nu(V) - \epsilon \le \nu_1(V_1) + \nu_2(V_2) - \epsilon \le \nu(E) \le \nu_1(K_1) + \nu_2(K_2) + \epsilon \le \nu(K) + \epsilon.$$


Suppose now that the representation (1) is valid for the regular complex measures
$\mu_1$ and $\mu_2$. Then $\int_X f\,d\mu = 0$ for all $f \in C_c(X)$, for $\mu := \mu_1 - \mu_2$. We must show
that $\|\mu\| = 0$ (i.e., $\mu = 0$). Since $|\mu_k|$ are finite positive regular Borel measures,
the positive Borel measure $\nu := |\mu_1| + |\mu_2|$ is regular. Write $d\mu = h\,d|\mu|$, where $h$




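is a Borel function with $|h| = 1$ (cf. Theorem 1.46). Since $\nu$ is regular, it follows
from Theorem 3.21 that there exists a sequence $\{f_n\} \subset C_c(X)$ that converges
to $\bar{h}$ in the $L^1(\nu)$-metric. Since $\bar{h}h = 1$, $|\mu| = |\mu_1 - \mu_2| \le |\mu_1| + |\mu_2| = \nu$, and
$\int_X f_n h\,d|\mu| = \int_X f_n\,d\mu = 0$, we obtain


$$\|\mu\| := |\mu|(X) = \Big|\int_X f_n h\,d|\mu| - \int_X \bar{h}h\,d|\mu|\Big| = \Big|\int_X (f_n - \bar{h})h\,d|\mu|\Big|$$


$$\le \int_X |f_n - \bar{h}|\,d|\mu| \le \int_X |f_n - \bar{h}|\,d\nu \to 0$$


as $n \to \infty$. Hence $\|\mu\| = 0$.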


Remark 4.10. If $S = \operatorname{supp}|\mu|$ (cf. Definition 3.26), we have


$$\|\phi\| = \|\mu\| := |\mu|(X) = |\mu|(S)$$


and


$$\int_X f\,d\mu = \int_S f\,d\mu \qquad (f \in L^1(|\mu|)).$$


The second formula follows from Theorem 1.46 and Definition 3.26(2).
Indeed, write $d\mu = h\,d|\mu|$ where $h$ is a Borel measurable function with $|h| = 1$ on
$X$ (cf. Theorem 1.46). Then for all $f \in L^1(|\mu|)$, we have (cf. Definition 3.26(2))


$$\int_X f\,d\mu := \int_X f h\,d|\mu| = \int_S f h\,d|\mu| = \int_S f\,d\mu.$$


4.5 Haar measure 


As an application of the Riesz—Markov representation theorem for positive linear 
functionals (Theorem 3.18), we shall construct a (left) translation-invariant 
positive measure on any locally compact topological group. 

A topological group is a group $G$ with a Hausdorff topology for which the group
operations (multiplication and inverse) are continuous. It follows that for each
fixed $a \in G$, the left (right) translation $x \mapsto ax$ ($x \mapsto xa$) is a homeomorphism of
$G$ onto itself. For any open neighborhood $V$ of the identity $e$, the set $aV$ ($Va$)
is an open neighborhood of $a$.

Suppose $G$ is locally compact, and $f, g \in C_c^+ := C_c^+(G) := \{f \in C_c(G);\ f \ge 0,$
$f$ not identically zero$\}$. Fix $0 < \alpha < \|g\| := \|g\|_u$. There exists $a \in G$ such that
$g(a) > \alpha$, and therefore there exists an open neighborhood of $e$, $V$, such that
$g(x) > \alpha$ for all $x \in aV$. By compactness of $\operatorname{supp} f$, there exist $x_1, \ldots, x_n \in G$
such that $\operatorname{supp} f \subset \bigcup_{k=1}^n x_k V$. Set $s_k := a x_k^{-1}$. Then for $x \in x_k V$, $s_k x \in aV$,
and therefore $g(s_k x) > \alpha$. If $x \in \operatorname{supp} f$, there exists $k \in \{1, \ldots, n\}$ such that
$x \in x_k V$, so that (for this $k$)


$$f(x) \le \|f\| \le \frac{\|f\|}{\alpha}\,g(s_k x) \le \sum_{i=1}^n c_i\,g(s_i x), \qquad (1)$$




where $c_i = \|f\|/\alpha$ for all $i = 1, \ldots, n$. Since (1) is trivial on $(\operatorname{supp} f)^c$, we see
that there exist $n \in \mathbb{N}$ and $(c_1, \ldots, c_n, s_1, \ldots, s_n) \in (\mathbb{R}^+)^n \times G^n$ such that


$$f(x) \le \sum_{i=1}^n c_i\,g(s_i x) \qquad (x \in G). \qquad (2)$$


Denote by $\Omega(f : g)$ the non-empty set of such rows (with $n$ varying) and let


$$(f : g) := \inf \sum_{i=1}^n c_i, \qquad (*)$$
where the infimum is taken over all $(c_1, \ldots, c_n)$ such that $(c_1, \ldots, c_n, s_1, \ldots, s_n)
\in \Omega(f : g)$ for some $n$ and $s_i$.

We verify some elementary properties of the functional (f : g) for g fixed as 
above. 

Let $f_s(x) := f(sx)$ for $s \in G$ fixed ($f_s$ is the so-called left $s$-translate of $f$).
If $(c_1, \ldots, c_n, s_1, \ldots, s_n) \in \Omega(f : g)$, then $f_s(x) = f(sx) \le \sum_i c_i\,g(s_i s x)$ for all
$x \in G$, hence $(c_1, \ldots, c_n, s_1 s, \ldots, s_n s) \in \Omega(f_s : g)$, and consequently $(f_s : g) \le
\sum_i c_i$. Taking the infimum over all such rows, we get $(f_s : g) \le (f : g)$. But
then $(f : g) = ((f_s)_{s^{-1}} : g) \le (f_s : g)$, and we conclude that


$$(f_s : g) = (f : g) \qquad (3)$$


for all $s \in G$ (i.e., the functional $(\cdot : g)$ is left translation invariant).

In the following arguments, $\epsilon$ denotes an arbitrary positive number.

If $c > 0$ and $(c_1, \ldots, c_n, s_1, \ldots, s_n) \in \Omega(f : g)$ is such that $\sum_i c_i < (f : g) + \epsilon$,
then $(cf)(x) \le \sum_i c\,c_i\,g(s_i x)$ for all $x \in G$, and therefore $(cf : g) \le \sum_i c\,c_i <
c(f : g) + c\epsilon$. The arbitrariness of $\epsilon$ implies that $(cf : g) \le c(f : g)$. Applying
this inequality to the function $cf$ and the constant $1/c$ (instead of $f$ and $c$,
respectively), we obtain the reversed inequality. Hence


$$(cf : g) = c(f : g) \qquad (c > 0). \qquad (4)$$


Let $x_0 \in G$ be such that $f(x_0) = \max_G f$ (since $f$ is continuous with compact
support, such a point $x_0$ exists). Then for any $(c_1, \ldots, c_n, s_1, \ldots, s_n) \in \Omega(f : g)$,


$$\|f\| = f(x_0) \le \sum_i c_i\,g(s_i x_0) \le \|g\| \sum_i c_i.$$


Hence
$$\frac{\|f\|}{\|g\|} \le (f : g). \qquad (5)$$
Next, consider three functions $f_1, f_2, g \in C_c^+$. If $f_1 \le f_2$, one has trivially
$\Omega(f_2 : g) \subset \Omega(f_1 : g)$, and therefore


$$f_1 \le f_2 \ \text{ implies } \ (f_1 : g) \le (f_2 : g). \qquad (6)$$
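
As a quick sanity check of definition (*) and inequality (5), one can compute the index of a function relative to itself. The display below is an illustration added here, using only (2), (*) and (5); $e$ denotes the identity of $G$.

```latex
% Sanity check: (f : f) = 1 for every f in C_c^+.
\begin{align*}
f(x) &\le 1 \cdot f(e\,x) \quad (x \in G)
   &&\Longrightarrow\quad (1;\, e) \in \Omega(f : f)
   \quad\Longrightarrow\quad (f : f) \le 1, \\
\frac{\|f\|}{\|f\|} &\le (f : f) \quad \text{by (5) with } g = f
   &&\Longrightarrow\quad (f : f) \ge 1 .
\end{align*}
```

Hence $(f : f) = 1$, which is the normalization one would expect if $(f : g)$ is read as a rough count of how many translated copies of $g$ are needed to cover $f$.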




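There exist $(c_1, \ldots, c_n, s_1, \ldots, s_n) \in \Omega(f_1 : g)$ and $(d_1, \ldots, d_m, t_1, \ldots, t_m)
\in \Omega(f_2 : g)$ such that $\sum_i c_i < (f_1 : g) + \epsilon/2$ and $\sum_j d_j < (f_2 : g) + \epsilon/2$.
Then for all $x \in G$,


$$f_1(x) + f_2(x) \le \sum_{i=1}^n c_i\,g(s_i x) + \sum_{j=1}^m d_j\,g(t_j x) = \sum_{k=1}^{n+m} c'_k\,g(s'_k x),$$
where $c'_k = c_k$, $s'_k = s_k$ for $k = 1, \ldots, n$, and $c'_k = d_{k-n}$, $s'_k = t_{k-n}$ for
$k = n+1, \ldots, n+m$. Thus $(c'_1, \ldots, c'_{n+m}, s'_1, \ldots, s'_{n+m}) \in \Omega(f_1 + f_2 : g)$, and
therefore


$$(f_1 + f_2 : g) \le \sum_{k=1}^{n+m} c'_k = \sum_{i=1}^n c_i + \sum_{j=1}^m d_j < (f_1 : g) + (f_2 : g) + \epsilon.$$


This proves that

$$(f_1 + f_2 : g) \le (f_1 : g) + (f_2 : g). \qquad (7)$$
Let $(c_1, \ldots, c_n, s_1, \ldots, s_n) \in \Omega(f : g)$ and $(d_1, \ldots, d_m, t_1, \ldots, t_m) \in \Omega(g : h)$,
where $f, g, h \in C_c^+$. Then for all $x \in G$,


$$f(x) \le \sum_i c_i\,g(s_i x) \le \sum_i c_i \sum_j d_j\,h(t_j s_i x) = \sum_{i,j} c_i d_j\,h(t_j s_i x),$$


that is, $(c_i d_j, t_j s_i)_{i=1,\ldots,n;\ j=1,\ldots,m} \in \Omega(f : h)$, and consequently


$$(f : h) \le \sum_{i,j} c_i d_j = \Big(\sum_i c_i\Big)\Big(\sum_j d_j\Big).$$


Taking the infimum of the right-hand side over all the rows involved, we conclude
that
$$(f : h) \le (f : g)(g : h). \qquad (8)$$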
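With $g$ fixed, denote
$$\Lambda_h f := \frac{(f : h)}{(g : h)} \qquad (f, h \in C_c^+).$$


Since $\Lambda_h f$ is a constant multiple of $(f : h)$, the function $f \mapsto \Lambda_h f$ satisfies (3),
(4), (6), and (7).
By (8), $\Lambda_h f \le (f : g)$. Also


$$(g : f)\Lambda_h f = (g : f)\frac{(f : h)}{(g : h)} \ge \frac{(g : h)}{(g : h)} = 1.$$
Hence
$$\frac{1}{(g : f)} \le \Lambda_h f \le (f : g). \qquad (10)$$


By (10), $\Lambda_h$ is a point in the compact Hausdorff space


$$\Delta := \prod_{f \in C_c^+} \Big[\frac{1}{(g : f)},\ (f : g)\Big]$$


(cf. Tychonoff’s theorem). Consider the system $\mathcal{V}$ of all (open) neighborhoods of
the identity. For each $V \in \mathcal{V}$, let $\Sigma_V$ be the closure in $\Delta$ of the set $\{\Lambda_h;\ h \in C_V^+\}$,
where $C_V^+ := C_V^+(G)$ consists of all $h \in C_c^+$ with support in $V$. Then $\Sigma_V$ is
a non-empty compact subset of $\Delta$. If $V_1, \ldots, V_n \in \mathcal{V}$ and $V := \bigcap_{i=1}^n V_i$, then
$C_V^+ \subset \bigcap_i C_{V_i}^+$, and therefore $\Sigma_V \subset \bigcap_i \Sigma_{V_i}$. In particular, the family of compact
sets $\{\Sigma_V;\ V \in \mathcal{V}\}$ has the finite intersection property, and consequently


$$\bigcap_{V \in \mathcal{V}} \Sigma_V \ne \emptyset.$$


Let $\Lambda$ be any point in this intersection, and extend the functional $\Lambda$ to $C_c :=
C_c(G)$ in the obvious way ($\Lambda 0 = 0$; $\Lambda f = \Lambda f^+ - \Lambda f^-$ for real $f \in C_c$, and
$\Lambda(u + iv) = \Lambda u + i\Lambda v$ for real $u, v \in C_c$).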


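Theorem 4.11. $\Lambda$ is a non-zero left translation invariant positive linear
functional on $C_c$.


Proof. Since $\Lambda \in \Delta$, we have


$$\frac{1}{(g : f)} \le \Lambda f \le (f : g)$$


for all $f \in C_c^+$, so that in particular $\Lambda f > 0$ for such $f$, and $\Lambda$ is not
identically zero.

For any $V \in \mathcal{V}$, we have $\Lambda \in \Sigma_V$; hence every basic neighborhood $N$ of $\Lambda$ in
$\Delta$ meets the set $\{\Lambda_h;\ h \in C_V^+\}$. Recall that


$$N = \{\Phi \in \Delta;\ |\Phi f_j - \Lambda f_j| < \epsilon,\ j = 1, \ldots, n\},$$


where $f_j \in C_c^+$. Thus, for any $V \in \mathcal{V}$ and $f_1, \ldots, f_n \in C_c^+$, there exists $h \in C_V^+$
such that


$$|\Lambda_h f_j - \Lambda f_j| < \epsilon \qquad (j = 1, \ldots, n). \qquad (11)$$


Given $f \in C_c^+$ and $c > 0$, apply (11) with $f_1 = f$ and $f_2 = cf$. By Property (4)
for $\Lambda_h$, we have


$$|\Lambda(cf) - c\Lambda f| \le |\Lambda(cf) - \Lambda_h(cf)| + c|\Lambda_h f - \Lambda f| < (1 + c)\epsilon,$$
so that $\Lambda(cf) = c\Lambda f$ by the arbitrariness of $\epsilon$.
A similar argument (using Relation (3) for $\Lambda_h$) shows that $\Lambda f_s = \Lambda f$ for all
$f \in C_c^+$ and $s \in G$.
In order to prove the additivity of $\Lambda$ on $C_c^+$, we use the following:


Lemma. Let $f_1, f_2 \in C_c^+(G)$ and $\epsilon > 0$. Then there exists $V \in \mathcal{V}$ such that
$$\Lambda_h f_1 + \Lambda_h f_2 \le \Lambda_h(f_1 + f_2) + \epsilon$$


for all $h \in C_V^+(G)$.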



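Proof of lemma. Let $f = f_1 + f_2$, and fix $k \in C_c^+(G)$ such that $k = 1$ on $\{x \in G;\ f(x) > 0\}$. For $g$ fixed as shown, let

$$\delta := \eta := \min\Big\{\frac{\epsilon}{4(f : g)},\ \frac{\epsilon}{4(k : g)},\ \frac{1}{2}\Big\}.$$

Thus

$$2\eta(f : g) \le \epsilon/2; \qquad 2\eta \le 1; \qquad 2\delta(k : g) \le \epsilon/2. \tag{12}$$

For $i = 1, 2$, let $h_i := f_i/F$, where $F := f + \delta k$ ($h_i = 0$ at points where $F = 0$). The functions $h_i$ are well defined, and continuous with compact support; it follows that there exists $V \in \mathcal{V}$ such that

$$|h_i(x) - h_i(y)| < \eta \qquad (i = 1, 2)$$

for all $x, y \in G$ such that $y^{-1}x \in V$ (uniform continuity of $h_i$!).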

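Let $h \in C_V^+(G)$. Let $(c_1, \dots, c_n;\ s_1, \dots, s_n) \in \Omega(F : h)$ and $x \in G$. If $j \in \{1, \dots, n\}$ is such that $h(s_j x) \ne 0$, then $s_j x \in V$, and therefore $|h_i(x) - h_i(s_j^{-1})| < \eta$ for $i = 1, 2$. Hence

$$h_i(x) \le |h_i(x) - h_i(s_j^{-1})| + h_i(s_j^{-1}) < h_i(s_j^{-1}) + \eta.$$

Therefore, for $i = 1, 2$,

$$f_i(x) = F(x) h_i(x) \le \sum_{\{j;\ h(s_j x) \ne 0\}} c_j\, h(s_j x)\, h_i(x) \le \sum_{\{j;\ h(s_j x) \ne 0\}} c_j\, [h_i(s_j^{-1}) + \eta]\, h(s_j x) \le \sum_{j=1}^n c_j^i\, h(s_j x),$$

where $c_j^i := c_j [h_i(s_j^{-1}) + \eta]$. Hence $(f_i : h) \le \sum_j c_j^i$, and since $h_1 + h_2 = f/F \le 1$, we obtain

$$(f_1 : h) + (f_2 : h) \le \sum_j c_j (1 + 2\eta).$$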


Taking the infimum of the right-hand side over all rows in $\Omega(F : h)$, we conclude that

$$(f_1 : h) + (f_2 : h) \le (F : h)(1 + 2\eta) \le [(f : h) + \delta(k : h)](1 + 2\eta) \quad \text{by (7) and (4)}$$
$$= (f : h) + 2\eta(f : h) + \delta(1 + 2\eta)(k : h).$$

Dividing by $(g : h)$, we obtain

$$\Lambda_h f_1 + \Lambda_h f_2 \le \Lambda_h f + 2\eta\Lambda_h f + \delta(1 + 2\eta)\Lambda_h k.$$

By (10) and (12), the second term on the right-hand side is $\le 2\eta(f : g) \le \epsilon/2$, and the third term is $\le 2\delta(k : g) \le \epsilon/2$, as desired.
We return to the proof of the theorem.


Given $\epsilon > 0$ and $f_1, f_2 \in C_c^+$, if $V \in \mathcal{V}$ is chosen as in the lemma, then for any $h \in C_V^+$, we have (by (7) for $\Lambda_h$)

$$|\Lambda_h(f_1 + f_2) - (\Lambda_h f_1 + \Lambda_h f_2)| \le \epsilon. \tag{13}$$

Apply (11) to the functions $f_1$, $f_2$, and $f_3 = f := f_1 + f_2$, with $V$ as in the lemma. Then for $h$ as in (11), it follows from (13) that

$$|\Lambda f - (\Lambda f_1 + \Lambda f_2)| \le |\Lambda f - \Lambda_h f| + |\Lambda_h f - (\Lambda_h f_1 + \Lambda_h f_2)| + |\Lambda_h f_1 - \Lambda f_1| + |\Lambda_h f_2 - \Lambda f_2| < 4\epsilon,$$

and the additivity of $\Lambda$ on $C_c^+$ follows from the arbitrariness of $\epsilon$.
The desired properties of $\Lambda$ on $C_c$ follow as in the proof of Theorem 4.8.


Theorem 4.12. If $\Lambda'$ is any left translation invariant positive linear functional on $C_c(G)$, then $\Lambda' = c\Lambda$ for some constant $c \ge 0$.


Proof. If $\Lambda' = 0$, take $c = 0$. So we may assume $\Lambda' \ne 0$. Since both $\Lambda$ and $\Lambda'$ are uniquely determined by their values on $C_c^+$ (by linearity), it suffices to show that $\Lambda'/\Lambda$ is constant on $C_c^+$. Thus, given $f, g \in C_c^+$ not in $\ker \Lambda$, we must show that $\Lambda' f/\Lambda f = \Lambda' g/\Lambda g$.

Let $K$ be the (compact) support of $f$; since $G$ is locally compact, there exists an open set $W$ with compact closure such that $K \subset W$. For each $x \in K$, there exists $W_x \in \mathcal{V}$ such that the $x$-neighborhood $xW_x$ is contained in $W$. By continuity of the group operation, there exists $V_x \in \mathcal{V}$ such that $V_x V_x \subset W_x$. By compactness of $K$, there exist $x_1, \dots, x_n \in K$ such that $K \subset \bigcup_{i=1}^n x_i V_{x_i}$. Let $V_1 := \bigcap_{i=1}^n V_{x_i}$. Then $V_1 \in \mathcal{V}$ and

$$K V_1 \subset \bigcup_{i=1}^n x_i V_{x_i} V_1 \subset \bigcup_{i=1}^n x_i V_{x_i} V_{x_i} \subset \bigcup_{i=1}^n x_i W_{x_i} \subset W.$$

Similarly, there exists $V_2 \in \mathcal{V}$ such that $V_2 K \subset W$.
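Let $\epsilon > 0$. By uniform continuity of $f$, there exist $V_3, V_4 \in \mathcal{V}$ such that, for all $x \in G$,

$$|f(x) - f(sx)| < \epsilon/2 \quad \text{for all } s \in V_3$$

and

$$|f(x) - f(xt)| < \epsilon/2 \quad \text{for all } t \in V_4.$$

Let $U := \bigcap_{j=1}^4 V_j$ and $V := U \cap U^{-1}$ (where $U^{-1} := \{x^{-1};\ x \in U\}$). Then $V \in \mathcal{V}$ has the following properties:

$$KV \subset W; \qquad VK \subset W; \qquad V^{-1} = V; \tag{14}$$

$$|f(sx) - f(xt)| < \epsilon \quad \text{for all } x \in G,\ s, t \in V. \tag{15}$$

We shall need to integrate (15) with respect to $x$ over $G$; since the constant $\epsilon$ is not integrable (unless $G$ is compact), we fix a function $k \in C_c^+$ such that $k = 1$ on $W$; necessarily

$$f(sx) = f(sx)k(x) \quad \text{and} \quad f(xs) = f(xs)k(x) \tag{16}$$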




for all $x \in G$ and $s \in V$. (This is trivial for $x \in W$ since $k = 1$ on $W$. If $x \notin W$, then $x \notin KV$ and $x \notin VK$ by (14); if $sx \in K$ for some $s \in V$, then $s^{-1} \in V$, and consequently $x = s^{-1}(sx) \in VK$, a contradiction. Hence $sx \notin K$, and similarly $xs \notin K$, for all $s \in V$. Therefore, both relations in (16) reduce to $0 = 0$ when $x \notin W$ and $s \in V$.)

By (15) and (16)

$$|f(xs) - f(sx)| \le \epsilon k(x) \tag{17}$$

for all $x \in G$ and $s \in V$.

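Fix $h' \in C_V^+$ not in $\ker \Lambda$, and let $h(x) := h'(x) + h'(x^{-1})$. Clearly, $h \notin \ker \Lambda$, since $h \ge h'$ and $\Lambda$ is a positive linear functional, hence monotone. Let $\mu, \mu'$ be the unique positive measures associated with $\Lambda$ and $\Lambda'$, respectively (cf. Theorem 3.18). Since $h(x^{-1}y) f(y) \in C_c(G \times G) \subset L^1(\mu \times \mu')$, we have by Fubini's theorem and the relation $h(x^{-1}y) = h(y^{-1}x)$:

$$\int\!\!\int h(y^{-1}x) f(y)\, d\mu'(x)\, d\mu(y) = \int\!\!\int h(x^{-1}y) f(y)\, d\mu(y)\, d\mu'(x). \tag{18}$$

By left translation invariance of $\Lambda'$, the left-hand side of (18) is equal to

$$\int \Big( \int h(y^{-1}x)\, d\mu'(x) \Big) f(y)\, d\mu(y) = \int\!\!\int h(x)\, d\mu'(x)\, f(y)\, d\mu(y) = \Lambda' h\, \Lambda f. \tag{19}$$

By left translation invariance of $\Lambda$, the right-hand side of (18) is equal to

$$\int \Big( \int h(y) f(xy)\, d\mu(y) \Big)\, d\mu'(x),$$

and therefore $\Lambda' h\, \Lambda f$ equals this last integral. On the other hand, by left translation invariance of $\Lambda'$,

$$\int \Big( \int h(y) f(yx)\, d\mu'(x) \Big)\, d\mu(y) = \int h(y) \Big( \int f(yx)\, d\mu'(x) \Big)\, d\mu(y) = \int h(y) \int f(x)\, d\mu'(x)\, d\mu(y) = \Lambda h\, \Lambda' f.$$

Since $h$ has support in $V$, we conclude from these calculations and from (17) that

$$|\Lambda' h\, \Lambda f - \Lambda h\, \Lambda' f| = \Big| \int_{x \in G} \int_{y \in V} h(y) [f(xy) - f(yx)]\, d\mu(y)\, d\mu'(x) \Big| \le \epsilon \int_{x \in G} \int_{y \in V} h(y) k(x)\, d\mu(y)\, d\mu'(x) = \epsilon\, \Lambda h\, \Lambda' k. \tag{20}$$

Similarly, for $g$ instead of $f$, and $k'$ associated to $g$ as $k$ was to $f$, we obtain

$$|\Lambda' h\, \Lambda g - \Lambda h\, \Lambda' g| \le \epsilon\, \Lambda h\, \Lambda' k'. \tag{21}$$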




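By (20) and (21) divided, respectively, by the positive numbers $\Lambda h\, \Lambda f$ and $\Lambda h\, \Lambda g$, we have

$$\Big| \frac{\Lambda' h}{\Lambda h} - \frac{\Lambda' f}{\Lambda f} \Big| \le \epsilon\, \frac{\Lambda' k}{\Lambda f}$$

and

$$\Big| \frac{\Lambda' h}{\Lambda h} - \frac{\Lambda' g}{\Lambda g} \Big| \le \epsilon\, \frac{\Lambda' k'}{\Lambda g}.$$

Consequently

$$\Big| \frac{\Lambda' f}{\Lambda f} - \frac{\Lambda' g}{\Lambda g} \Big| \le \epsilon \Big( \frac{\Lambda' k}{\Lambda f} + \frac{\Lambda' k'}{\Lambda g} \Big),$$

and the desired conclusion $\Lambda' f/\Lambda f = \Lambda' g/\Lambda g$ follows from the arbitrariness of $\epsilon$.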


Definition 4.13. The unique (up to a positive constant factor) non-zero left translation invariant positive linear functional $\Lambda$ on $C_c(G)$ is called the (left) Haar functional for $G$. The measure $\mu$ corresponding to $\Lambda$ through Theorem 3.18 is the (left) Haar measure for $G$.


Thus, $\mu$ is the unique (up to a positive constant factor) non-zero positive measure with the properties listed in Theorem 3.18 (for the locally compact Hausdorff space $G$) such that $\int_G f_t\, d\mu = \int_G f\, d\mu$ for all $f \in C_c(G)$ and $t \in G$. Equivalently, $\mu$ is the unique (up to a positive constant factor) non-zero positive measure with the properties listed in Theorem 3.18 such that for each measurable $E \subset G$ and $t \in G$, the set $tE$ is also measurable and $\mu(tE) = \mu(E)$.

If $G$ is compact, its (unique up to a positive constant factor) left Haar measure (which is finite by Theorem 3.18(2)) is normalized so that $G$ has measure 1.

In an analogous way, there exists a unique (up to a positive constant factor) right translation invariant positive measure (as in Theorem 3.18) $\lambda$ on $G$:

$$\int_G f^t\, d\lambda = \int_G f\, d\lambda \qquad (f \in C_c(G);\ t \in G), \tag{22}$$

where $f^t(x) := f(xt)$.
Given the left Haar functional $\Lambda$ on $G$ and $t \in G$, define the functional $\Lambda^t$ on $C_c$ by $\Lambda^t f := \Lambda f^t$. Then $\Lambda^t$ is a non-zero left translation invariant positive linear functional (because $\Lambda^t(f_s) = \Lambda (f_s)^t = \Lambda (f^t)_s = \Lambda(f^t) =: \Lambda^t f$), and therefore, by Theorem 4.12,

$$\Lambda^t = \delta(t^{-1})\Lambda \tag{23}$$

for some positive number $\delta(t^{-1})$. The function $\delta(\cdot)$ is called the modular function of $G$. Since $(\Lambda^t)^s = \Lambda^{ts}$, we have

$$\delta((ts)^{-1})\Lambda = (\Lambda^t)^s = \delta(t^{-1})\Lambda^s = \delta(t^{-1})\delta(s^{-1})\Lambda,$$

that is, $\delta(\cdot)$ is a homomorphism of $G$ into the multiplicative group of positive reals. Furthermore, for each $f \in C_c$, the function $t \to \Lambda^t(f)$ is continuous by Lebesgue's dominated convergence theorem. Hence, by (23), $\delta$ is continuous.




We say that $G$ is unimodular if $\delta(\cdot) = 1$. If $G$ is compact, applying (23) to the function $1 \in C_c(G)$, we get $\delta(\cdot) = 1$. If $G$ is abelian, we have $f^t = f_t$, hence $\Lambda^t f = \Lambda(f_t) = \Lambda f$ for all $f \in C_c$, and therefore $\delta(\cdot) = 1$. Thus, compact groups and (locally compact) abelian groups are unimodular. So are discrete groups; see Exercise 11.
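To see that the modular function need not be trivial, here is a sketch for the affine group of the line, computed with the convention of (23). This example is not taken from the text: the group, its coordinates $(a,b)$, and the candidate functional built from $da\,db/a^2$ are assumptions introduced only for illustration.

```latex
% Assumed example: G = \{(a,b) : a > 0,\ b \in \mathbb{R}\} with product
% (a,b)(a',b') = (aa',\ ab' + b); G is locally compact and non-abelian.
\[
  \Lambda f := \int_0^\infty\!\!\int_{-\infty}^{\infty} f(a,b)\,\frac{db\,da}{a^{2}}
  \qquad (f \in C_c(G)).
\]
% Left invariance: for s = (\alpha,\beta), substitute a' = \alpha a,
% b' = \alpha b + \beta (Jacobian \alpha^{2}, and a^{2} = a'^{2}/\alpha^{2}):
\[
  \Lambda(f_s) = \int\!\!\int f(\alpha a,\ \alpha b + \beta)\,\frac{db\,da}{a^{2}}
               = \Lambda f .
\]
% Right translation by t = (\alpha,\beta): substitute a' = a\alpha,
% b' = a\beta + b (Jacobian \alpha, and a^{2} = a'^{2}/\alpha^{2}):
\[
  \Lambda^{t} f = \int\!\!\int f(a\alpha,\ a\beta + b)\,\frac{db\,da}{a^{2}}
                = \alpha\,\Lambda f .
\]
% By (23), \delta(t^{-1}) = \alpha; since t^{-1} = (1/\alpha,\ -\beta/\alpha),
\[
  \delta(a,b) = \frac{1}{a} \qquad ((a,b) \in G).
\]
```

Under these assumptions $\delta$ is a continuous homomorphism into the positive reals that is not identically 1, so this group is not unimodular.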

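For $f \in C_c$ let $\tilde f(x) := \delta(x^{-1}) f(x^{-1})$ and define $\tilde\Lambda$ by $\tilde\Lambda f := \Lambda \tilde f$ ($f \in C_c$). Then $\tilde\Lambda$ is a non-zero positive linear functional on $C_c$; it is left translation invariant because

$$\widetilde{(f_s)}(x) = \delta(x^{-1}) f_s(x^{-1}) = \delta(x^{-1}) f(s x^{-1}) = \delta(s^{-1})\, \delta((x s^{-1})^{-1})\, f((x s^{-1})^{-1}) = \delta(s^{-1})\, \tilde f(x s^{-1}) = \delta(s^{-1})\, (\tilde f)^{s^{-1}}(x),$$

and therefore by (23),

$$\tilde\Lambda(f_s) = \Lambda \widetilde{(f_s)} = \delta(s^{-1})\, \Lambda^{s^{-1}} \tilde f = \Lambda \tilde f = \tilde\Lambda f.$$

By Theorem 4.12, there exists a positive constant $a$ such that $\tilde\Lambda = a\Lambda$. Since $\tilde{\tilde f} = f$ for all $f$, we have $\tilde{\tilde\Lambda} = \Lambda$, hence $a^2 = 1$, and therefore $\tilde\Lambda = \Lambda$. In terms of the Haar measure $\mu$, this equality takes the form

$$\int_G \delta(x^{-1}) f(x^{-1})\, d\mu(x) = \int_G f(x)\, d\mu(x) \qquad (f \in C_c). \tag{24}$$

As a result, if $G$ is unimodular, the Haar functional $\Lambda$ is also inverse invariant, that is, $\int_G f(x^{-1})\, d\mu(x) = \int_G f(x)\, d\mu(x)$ for all $f \in C_c(G)$.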
Exercises 


1. Let $X, Y$ be Banach spaces, $Z$ a dense subspace of $X$, and $T \in B(Z,Y)$. Then there exists a unique $\tilde T \in B(X,Y)$ such that $\tilde T|_Z = T$. Moreover, the map $T \to \tilde T$ is an isometric isomorphism of $B(Z,Y)$ onto $B(X,Y)$.


2. Let X be a locally compact Hausdorff space. Prove that Co(X)* is 
isometrically isomorphic to M,(X), the space of all regular complex Borel 
measures on X. (Hint: Theorems 3.24 and 4.9, and Exercise 1.) 


3. Let $X_k$, $k = 1, \dots, n$ be normed spaces, and consider $\prod_k X_k$ as a normed space with the norm $\|[x_1, \dots, x_n]\| = \sum_k \|x_k\|$. Prove that there exists an isometric isomorphism of $(\prod_k X_k)^*$ and $\prod_k X_k^*$ with the norm $\|[x_1^*, \dots, x_n^*]\| = \max_k \|x_k^*\|$. (Hint: given $\phi \in (\prod_k X_k)^*$, define $x_k^* x_k = \phi([0, \dots, x_k, 0, \dots, 0])$ for $x_k \in X_k$. Note that $\phi([x_1, \dots, x_n]) = \sum_k x_k^* x_k$.)


4. Let X be a locally compact Hausdorff space. Let Y be a normed space, and 
T € B(C.(X),Y). Prove that there exists a unique P : B(X) — Y** := 
(Y*)* such that P(-)y* € M,(X) for each y* € Y* and 


$$y^* T f = \int_X f\, d(P(\cdot)y^*)$$


for all $f \in C_c(X)$ and $y^* \in Y^*$. Moreover $\|P(\cdot)y^*\| = \|y^* \circ T\|$ for the appropriate norms (for all $y^* \in Y^*$) and $\|P(\delta)\| \le \|T\|$ for all $\delta \in \mathcal{B}(X)$.




Convolution on $L^p$


5. Let $L^p$ denote the Lebesgue spaces on $\mathbb{R}^k$ with respect to Lebesgue measure. Prove that if $f \in L^1$ and $g \in L^p$, then $f * g \in L^p$ and $\|f * g\|_p \le \|f\|_1 \|g\|_p$. (Hint: use Theorems 4.6, 2.18, 1.33, and the translation invariance of Lebesgue measure; cf. Exercise 7, Chapter 2, in its $\mathbb{R}^k$ version.)


Approximate identities 


6. Let $m$ denote the normalized Lebesgue measure on $[-\pi, \pi]$. Let $K_n : [-\pi, \pi] \to [0, \infty)$ be Lebesgue measurable functions such that $\int K_n\, dm = 1$ and

$$\sup_{\delta \le |x| \le \pi} K_n(x) \to 0$$

as $n \to \infty$, for all $\delta > 0$. (Any sequence $\{K_n\}$ with these properties is called an approximate identity.) Extend $K_n$ to $\mathbb{R}$ as $2\pi$-periodic functions. Consider the convolutions

$$(K_n * f)(x) := \int_{-\pi}^{\pi} K_n(x - t) f(t)\, dm(t) = \int_{-\pi}^{\pi} f(x - t) K_n(t)\, dm(t)$$

with $2\pi$-periodic functions $f$ on $\mathbb{R}$. Prove:

(a) If $f$ is continuous, $K_n * f \to f$ uniformly on $[-\pi, \pi]$. (Hint: $\int_{-\pi}^{\pi} = \int_{|t| < \delta} + \int_{\delta \le |t| \le \pi}$.)

(b) If $f \in L^p := L^p(-\pi, \pi)$ for some $p \in [1, \infty)$, then $K_n * f \to f$ in $L^p$. (Hint: use the density of $C([-\pi, \pi])$ in $L^p$, cf. Corollary 3.21, Part (a), and Exercise 5.)

(c) If $f \in L^\infty$, then $K_n * f \to f$ in the weak*-topology on $L^\infty$ (cf. Theorem 4.6); this means that $\int (K_n * f) g\, dm \to \int f g\, dm$ for all $g \in L^1$.


Miscellaneous 


7. Consider the measure space $(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$, where $\mu$ is the counting measure ($\mu(E)$ is the number of points in $E$ if $E$ is a finite subset of $\mathbb{N}$ and $= \infty$ otherwise). The space $l^p := L^p(\mathbb{N}, \mathcal{P}(\mathbb{N}), \mu)$ is the space of all complex sequences $x := \{x(n)\}$ such that $\|x\|_p := (\sum |x(n)|^p)^{1/p} < \infty$ (in case $p < \infty$) or $\|x\|_\infty := \sup |x(n)| < \infty$ (in case $p = \infty$). As a special case of Theorem 4.6, if $p \in [1, \infty)$ and $q$ is its conjugate exponent, then $(l^p)^*$ is isometrically isomorphic to $l^q$ through the map $x^* \in (l^p)^* \to y \in l^q$, where $y := \{y(n)\}$ is the unique element of $l^q$ such that $x^* x = \sum x(n) y(n)$ for all $x \in l^p$. Prove this directly! (Hint: consider the unit vectors $e_m \in l^p$ with $e_m(n) = \delta_{n,m}$, the Kronecker delta.)


8. Consider $\mathbb{N}$ with the discrete topology, and let $c_0 := C_0(\mathbb{N})$ (this is the space of all complex sequences $x := \{x_n\} = \{x(n)\}$ with $\lim x_n = 0$). As a special case of Exercise 2, if $x^* \in c_0^*$, there exists a unique complex Borel measure $\mu$ on $\mathbb{N}$ such that $x^* x = \sum_n x(n)\mu(\{n\})$. Denote $y(n) = \mu(\{n\})$. Then $\|y\|_1 = \sum |\mu(\{n\})| \le |\mu|(\mathbb{N}) = \|\mu\| = \|x^*\|$, that is, $y \in l^1$ and $\|y\|_1 \le \|x^*\|$. The reversed inequality is trivial. This shows that $c_0^*$ is isometrically isomorphic to $l^1$ through the map $x^* \to y$, where $x^* x = \sum_n x(n) y(n)$. Prove this directly!


9. Let $c$ denote the space of all convergent complex sequences $x = \{x(n)\}$ with pointwise operations and the supremum norm. Show that $c$ is a Banach space and $c^*$ is isometrically isomorphic to $l^1$. (Hint: given $x^* \in c^*$, $x^*|_{c_0} \in c_0^*$; apply Exercise 8, and note that for each $x \in c$, $x - (\lim x)e \in c_0$, where $e(\cdot) = 1$.)


10. Let $(X, \mathcal{A}, \mu)$ be a positive measure space, $q \in (1, \infty]$, and $p = q/(q - 1)$. Prove that for all $h \in L^q(\mu)$

$$\|h\|_q = \sup \Big| \sum_k \alpha_k \int_{E_k} h\, d\mu \Big|,$$

where the supremum is taken over all finite sums with $\alpha_k \in \mathbb{C}$ and $E_k \in \mathcal{A}$ with $0 < \mu(E_k) < \infty$, such that $\sum |\alpha_k|^p \mu(E_k) \le 1$. (In case $q = \infty$, assume that the measure space is $\sigma$-finite.)


11. Describe the left Haar measure on a discrete group $G$ (i.e., $G$ with the discrete topology) and prove that $G$ is unimodular.


5 


Duality 


We studied in preceding chapters the conjugate space X* for various special 
normed spaces. Our purpose in Chapter 5 is to examine X* and its relationship 
to X for an arbitrary normed space X. More generally, we study continuous 
linear functionals on topological vector spaces, which are (complex) vector 
spaces together with a Hausdorff topology making the vector space operations 
continuous. 

The general results proved in this chapter are indispensable and invaluable 
to functional analysis and beyond, and in particular to operator theory. 

First and foremost is the Hahn–Banach lemma, which is one of the “Three
Basic Principles of Linear Functional Analysis” (see Chapter 6 for the other 
two). These principles are so fundamental and have so many consequences that 
they are often used tacitly. The Hahn—Banach lemma and its corollaries discuss 
extending linear functionals while preserving certain properties, and are often 
employed in separation arguments. 

The Hahn-Banach theorem says that a bounded linear functional on a 
subspace of a normed space X can be extended to a linear functional on X 
with the same norm. One result of this is that X can be embedded isometrically 
as a subspace of its second dual X** := (X*)*. We say that X is reflexive when 
this embedding is surjective. Examples and properties of reflexivity are discussed. 

The next theme is separation. Given two disjoint, non-empty convex sets
M, N in a vector space X, one is looking for a linear functional f on X
that separates M and N, namely $\sup \Re f(M) \le \inf \Re f(N)$ (sometimes strict
inequality is desirable). We prove separation results under suitable assumptions
in several contexts: the general one of vector spaces; the one of topological vector 
spaces, in which f is expected to be continuous; and the most restrictive one 
of locally convex topological vector spaces, which are particularly amenable to 
separation. Local convexity means that the topology of X has a base consisting 
of convex sets. 

For a vector space X and a (separating) family of linear functionals Γ on
X, the Γ-topology on X is the weakest topology making each functional in


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press. 
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0005




Γ continuous. This topology turns X into a locally convex topological vector
space. For a normed space X, we study in depth two distinguished cases: the 
X*-topology on X, called the weak topology, and the X-topology on X* (with 
X viewed as embedded in X**), called the weak*-topology. In particular, we 
prove two results on the properties of the norm-closed unit ball vis-a-vis the 
weak*-topology, namely Alaoglu’s and Goldstine’s theorems. 

The Krein—Milman theorem asserts that a compact convex subset of a 
locally convex topological vector space can be reconstructed from its extremal 
points. This mighty result often yields astonishingly simple proofs of otherwise 
difficult theorems because extremal points tend to have favorable properties. 
As demonstration, we provide a short proof of the Stone-Weierstrass theorem, 
which gives sufficient conditions for a subalgebra of C(X) to be dense in C(X)
where X is a compact Hausdorff space. In fact, we derive it from a more general 
result called Bishop’s antisymmetry theorem. Finally, Milman’s theorem about 
the “origin” of extremal points in compact convex sets is proved in Exercise 11. 

We give another, more elementary (and classical) proof of the Stone- 
Weierstrass theorem. A generalization of the Stone—Weierstrass theorem to 
locally compact Hausdorff spaces is presented in Exercise 9. 

Marcinkiewicz’s interpolation theorem is then established with the aid of 
many results from the previous chapters. 

The fixed-point theorems due to Markov-Kakutani, Hahn, and (consequently) 
Kakutani are proved. To give a small taste of the beauty and power of fixed-point 
theory we show how Kakutani’s theorem yields a short proof of the existence 
of the Haar measure for compact topological groups. En route we prove the 
Arzelà–Ascoli theorem characterizing compactness in C(X).

The last subject is the bounded weak*-topology. We prove theorems of 
Dieudonné and Krein-Smulian. The latter says that, remarkably, if X is a 
Banach space, then for a linear functional on X* to be weak*-continuous 
it suffices that its restriction to the norm-closed unit ball of X* be weak*- 
continuous. In contrast to continuity of functionals with respect to the norm 
topology on X*, this is not at all trivial. 


5.1 The Hahn—Banach theorem 


Let $X$ be a vector space over $\mathbb{R}$. Suppose $p : X \to \mathbb{R}$ is subadditive and homogeneous for non-negative scalars. A linear functional $f$ on a subspace $Y$ of $X$ is p-dominated if $f(y) \le p(y)$ for all $y \in Y$. The starting point of this section is the following:


Lemma 5.1 (The Hahn–Banach lemma). Let $f$ be a p-dominated linear functional on the subspace $Y$ of $X$. Then there exists a p-dominated linear functional $F$ on $X$ such that $F|_Y = f$.


Proof. A p-dominated extension of $f$ is a p-dominated linear functional $g$ on a subspace $D(g)$ of $X$ containing $Y$, such that $g|_Y = f$. The family $\mathcal{F}$ of all p-dominated extensions of $f$ is partially ordered by setting $g \le h$ (for $g, h \in \mathcal{F}$)




if $h$ is an extension of $g$. Each totally ordered subfamily $\mathcal{F}_0$ of $\mathcal{F}$ has an upper bound in $\mathcal{F}$, namely, the functional $w$ whose domain is the subspace $D(w) := \bigcup_{g \in \mathcal{F}_0} D(g)$, and for $x \in D(w)$ (so that $x \in D(g)$ for some $g \in \mathcal{F}_0$), $w(x) = g(x)$. Note that $w$ is well defined, that is, its domain is indeed a subspace of $X$ and the value $w(x)$ is independent of the particular $g$ such that $x \in D(g)$, thanks to the total ordering of $\mathcal{F}_0$. By Zorn's lemma, $\mathcal{F}$ has a maximal element $F$. To complete the proof, we wish to show that $D(F) = X$. Suppose that $D(F)$ is a proper subspace of $X$, and let then $z_0 \in X - D(F)$. Let $Z$ be the subspace spanned by $D(F)$ and $z_0$. The general element of $Z$ has the form $z = u + \alpha z_0$ with $u \in D(F)$ and $\alpha \in \mathbb{R}$ uniquely determined (indeed, if $z = u' + \alpha' z_0$ is another such representation with $\alpha' \ne \alpha$, then $z_0 = (\alpha' - \alpha)^{-1}(u - u') \in D(F)$, a contradiction; thus $\alpha' = \alpha$, and therefore $u' = u$). For any choice of $\lambda \in \mathbb{R}$, the functional $h$ with domain $Z$, defined by $h(z) = F(u) + \alpha\lambda$, is a well-defined linear functional such that $h|_{D(F)} = F$. If we show that $\lambda$ can be chosen such that the corresponding $h$ is p-dominated, then $h \in \mathcal{F}$ with domain $Z$ properly containing $D(F)$, a contradiction to the maximality of the element $F$ of $\mathcal{F}$.
Since $F$ is p-dominated, we have for all $u', u'' \in D(F)$

$$F(u') + F(u'') = F(u' + u'') \le p(u' + u'') = p([u' + z_0] + [u'' - z_0]) \le p(u' + z_0) + p(u'' - z_0),$$

that is,

$$F(u'') - p(u'' - z_0) \le p(u' + z_0) - F(u') \qquad (u', u'' \in D(F)).$$

Pick any $\lambda$ between the supremum of the numbers on the left-hand side and the infimum of the numbers on the right-hand side. Then for all $u', u'' \in D(F)$,

$$F(u') + \lambda \le p(u' + z_0) \quad \text{and} \quad F(u'') - \lambda \le p(u'' - z_0).$$

Taking $u' = u/\alpha$ if $\alpha > 0$ and $u'' = u/(-\alpha)$ if $\alpha < 0$ and multiplying the inequalities by $\alpha$ and $-\alpha$, respectively, it follows from the homogeneity of $p$ for non-negative scalars that

$$F(u) + \alpha\lambda \le p(u + \alpha z_0) \qquad (u \in D(F))$$

for all real $\alpha$, that is, $h(z) \le p(z)$ for all $z \in Z$.
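To make the choice of $\lambda$ concrete, the following sketch works out the one-dimensional extension step in a small case. The space $X = \mathbb{R}^2$, the functional $p$, and the starting functional are assumptions chosen for illustration, with the $x_1$-axis playing the role of $D(F)$.

```latex
% Assumed data: X = \mathbb{R}^2, p(x) = |x_1| + |x_2|,
% D(F) = \{(t,0) : t \in \mathbb{R}\}, F(t,0) = t, z_0 = (0,1).
\[
  \sup_{t \in \mathbb{R}} \bigl[F(t,0) - p((t,0) - z_0)\bigr]
    = \sup_{t}\,\bigl(t - |t| - 1\bigr) = -1,
\]
\[
  \inf_{t \in \mathbb{R}} \bigl[p((t,0) + z_0) - F(t,0)\bigr]
    = \inf_{t}\,\bigl(|t| + 1 - t\bigr) = 1 .
\]
% Hence every \lambda \in [-1,1] is admissible, and for such \lambda
\[
  h(t,\alpha) = t + \alpha\lambda \le |t| + |\alpha| = p(t,\alpha).
\]
```

In this sketch the admissible interval for $\lambda$ is non-degenerate, which shows that a p-dominated extension need not be unique.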


Theorem 5.2 (The Hahn–Banach theorem). Let $Y$ be a subspace of the normed space $X$, and let $y^* \in Y^*$. Then there exists $x^* \in X^*$ such that $x^*|_Y = y^*$ and $\|x^*\| = \|y^*\|$.


Proof. Case of real scalar field: Take

$$p(x) := \|y^*\| \|x\| \qquad (x \in X).$$

This function is subadditive and homogeneous for non-negative scalars, and

$$y^* y \le |y^* y| \le \|y^*\| \|y\| = p(y) \qquad (y \in Y).$$




By Lemma 5.1, there exists a p-dominated linear functional $F$ on $X$ such that $F|_Y = y^*$. Thus, for all $x \in X$,

$$F(x) \le \|y^*\| \|x\|$$

and

$$-F(x) = F(-x) \le p(-x) = \|y^*\| \|x\|,$$

that is,

$$|F(x)| \le \|y^*\| \|x\|.$$

This shows that $x^* := F \in X^*$ and $\|x^*\| \le \|y^*\|$. Since the reversed inequality is trivial for any linear extension of $y^*$, the theorem is proved in the case of real scalars.
Case of complex scalar field: Take $f := \Re y^*$ in Lemma 5.1. Then $f(iy) = \Re[y^*(iy)] = \Re[i y^* y] = -\Im(y^* y)$, and therefore

$$y^* y = f(y) - i f(iy) \qquad (y \in Y). \tag{1}$$

For $p$ as before, the functional $f$ is p-dominated and linear on the vector space $Y$ over the field $\mathbb{R}$ (indeed, $f(y) \le |y^* y| \le p(y)$ for all $y \in Y$). By Lemma 5.1, there exists a p-dominated linear functional $F : X \to \mathbb{R}$ (over real scalars!) such that $F|_Y = f$. Define

$$x^* x := F(x) - i F(ix) \qquad (x \in X).$$

By (1), $x^*|_Y = y^*$. Clearly, $x^*$ is additive and homogeneous for real scalars. Also, for all $x \in X$,

$$x^*(ix) = F(ix) - i F(-x) = i[F(x) - i F(ix)] = i x^* x,$$

and it follows that $x^*$ is homogeneous over $\mathbb{C}$.
Given $x \in X$, write $x^* x = \rho\omega$ with $\rho \ge 0$ and $\omega \in \mathbb{C}$ with modulus one. Then

$$|x^* x| = \bar\omega x^* x = x^*(\bar\omega x) = \Re[x^*(\bar\omega x)] = F(\bar\omega x) \le \|y^*\| \|\bar\omega x\| = \|y^*\| \|x\|.$$

Thus, $x^* \in X^*$ with norm $\le \|y^*\|$ (hence $= \|y^*\|$, since $x^*$ is an extension of $y^*$).
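As a quick check of the formula $x^* x := F(x) - iF(ix)$ used in the complex case, the following sketch takes the simplest possible setting. The choice $X = \mathbb{C}$ with $F = \Re$ is an assumption made only for illustration.

```latex
% Assumed data: X = \mathbb{C}, regarded as a real vector space, and F(x) = \Re x.
% For x = a + ib with a, b real, we have ix = -b + ia, so F(ix) = -b and
\[
  x^* x = F(x) - iF(ix) = a - i(-b) = a + ib = x .
\]
% Thus the real-linear functional \Re is the real part of the complex-linear
% functional x \mapsto x, and the norms agree:
\[
  \|x^*\| = \sup_{x \ne 0} \frac{|x|}{|x|} = 1 = \sup_{x \ne 0} \frac{|\Re x|}{|x|} = \|F\| .
\]
```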


Corollary 5.3. Let $Y$ be a subspace of the normed space $X$, and let $x \in X$ be such that

$$d := d(x, Y) := \inf_{y \in Y} \|x - y\| > 0.$$

Then there exists $x^* \in X^*$ with $\|x^*\| = 1/d$, such that

$$x^*|_Y = 0, \qquad x^* x = 1.$$


Proof. Let $Z$ be the linear span of $Y$ and $x$. Since $d(x, Y) > 0$, $x \notin Y$, so that the general element $z \in Z$ has the unique representation $z = y + \alpha x$ with $y \in Y$



and $\alpha \in \mathbb{C}$. Define then $z^* z = \alpha$. This is a well-defined linear functional on $Z$, $z^*|_Y = 0$, and $z^* x = 1$. Also $z^*$ is bounded, since

$$\|z^*\| := \sup_{0 \ne z \in Z} \frac{|\alpha|}{\|z\|} = \sup_{\alpha \ne 0;\ y \in Y} \frac{1}{\|(y + \alpha x)/\alpha\|} = \frac{1}{\inf_{y \in Y} \|x - y\|} = 1/d.$$

By the Hahn–Banach theorem, there exists $x^* \in X^*$ with norm $= \|z^*\| = 1/d$ that extends $z^*$, whence $x^*|_Y = 0$ and $x^* x = 1$.
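A small sanity check of the corollary may help; the following sketch uses real scalars for simplicity, and the space, subspace, and vector are assumptions chosen for illustration.

```latex
% Assumed data: X = \mathbb{R}^2 with the Euclidean norm,
% Y = \{(t,0) : t \in \mathbb{R}\}, x = (0,2).
\[
  d = d(x,Y) = \inf_{t}\,\|(0,2) - (t,0)\| = \inf_{t} \sqrt{t^{2} + 4} = 2 .
\]
% The functional x^*(u_1,u_2) := u_2/2 vanishes on Y, satisfies x^* x = 1, and
\[
  \|x^*\| = \sup_{u \ne 0} \frac{|u_2|/2}{\sqrt{u_1^{2} + u_2^{2}}}
          = \frac{1}{2} = \frac{1}{d} .
\]
```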


Note that if $Y$ is a closed subspace of $X$, the condition $d(x, Y) > 0$ is equivalent to $x \notin Y$. If $Y \ne X$, such an $x$ exists, and therefore, by Corollary 5.3, there exists a non-zero $x^* \in X^*$ such that $x^*|_Y = 0$. Formally:


Corollary 5.4. Let $Y \ne X$ be a closed subspace of the normed space $X$. Then there exists a non-zero $x^* \in X^*$ that vanishes on $Y$.


For a not necessarily closed subspace $Y$, we apply the last corollary to its closure $\bar Y$ (which is a closed subspace). By continuity, vanishing of $x^*$ on $Y$ is equivalent to its vanishing on $\bar Y$, and we obtain, therefore, the following useful criterion for non-density.


Corollary 5.5. Let $Y$ be a subspace of the normed space $X$. Then $Y$ is not dense in $X$ iff there exists a non-zero $x^* \in X^*$ that vanishes on $Y$.


For reference, we also state this criterion as a density criterion.


Corollary 5.6. Let $Y$ be a subspace of the normed space $X$. Then $Y$ is dense in $X$ iff the vanishing of an $x^* \in X^*$ on $Y$ implies $x^* = 0$.


Corollary 5.7. Let $X$ be a normed space, and let $0 \ne x \in X$. Then there exists $x^* \in X^*$ such that $\|x^*\| = 1$ and $x^* x = \|x\|$. In particular, $X^*$ separates points, that is, if $x, y$ are distinct vectors in $X$, then there exists a functional $x^* \in X^*$ such that $x^* x \ne x^* y$.


Proof. We take $Y = \{0\}$ in Corollary 5.3. Then $d(x, Y) = \|x\| > 0$, so that there exists $z^* \in X^*$ such that $\|z^*\| = 1/\|x\|$ and $z^* x = 1$. Let $x^* := \|x\| z^*$. Then $x^* x = \|x\|$ and $\|x^*\| = 1$ as wanted.

If $x, y$ are distinct vectors, we apply the preceding result to the non-zero vector $x - y$; we then obtain $x^* \in X^*$ such that $\|x^*\| = 1$ and $x^* x - x^* y = x^*(x - y) = \|x - y\| \ne 0$.


Corollary 5.8. Let $X$ be a normed space. Then for each $x \in X$,

$$\|x\| = \sup_{x^* \in X^*;\ \|x^*\| = 1} |x^* x|.$$


Proof. The relation being trivial for $x = 0$, we assume $x \ne 0$, and apply Corollary 5.7 to obtain an $x^*$ with unit norm such that $x^* x = \|x\|$. Therefore,




the supremum shown is $\ge \|x\|$. Since the reverse inequality is a consequence of the definition of the norm of $x^*$, the result follows.
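The supremum is attained by an explicit functional in simple cases. The following sketch is an illustration only; the space $\mathbb{C}^2$ with the $\ell^1$-type norm and the vector $x$ are assumptions, and the formula for the norm of a functional on this space is the standard one (compare Exercise 3 of Chapter 4).

```latex
% Assumed data: X = \mathbb{C}^2 with \|u\|_1 = |u_1| + |u_2|, and x = (3, -4i).
% A functional x^* u = c_1 u_1 + c_2 u_2 on this space has norm \max(|c_1|, |c_2|).
% Choose c_1 = 1, c_2 = i, so that c_k x_k = |x_k| for k = 1, 2:
\[
  x^* x = 1 \cdot 3 + i \cdot (-4i) = 3 + 4 = 7 = \|x\|_1,
  \qquad \|x^*\| = \max(|1|, |i|) = 1 .
\]
```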


Given $x \in X$, the functional

$$\kappa x : x^* \to x^* x$$

on $X^*$ is linear, and Corollary 5.8 establishes that $\|\kappa x\| = \|x\|$. This means that $\hat x := \kappa x$ is a continuous linear functional on $X^*$, that is, an element of $(X^*)^* =: X^{**}$.

The map $\kappa : X \to X^{**}$ is linear (since for all $x^* \in X^*$, $[\kappa(x + \alpha x')]x^* = x^*(x + \alpha x') = x^* x + \alpha x^* x' = [\kappa x + \alpha \kappa x']x^*$) and isometric (since $\|\kappa x - \kappa x'\| = \|\kappa(x - x')\| = \|x - x'\|$). The isometric isomorphism $\kappa$ is called the canonical (or natural) embedding of $X$ in the second dual $X^{**}$.

Note that $X$ is complete iff its isometric image $\hat X := \kappa X$ is complete, and since conjugate spaces are always complete, $\kappa X$ is complete iff it is a closed subspace of $X^{**}$. Thus, a normed space is complete iff its canonical embedding $\kappa X$ is a closed subspace of $X^{**}$. In case $\kappa X = X^{**}$, we say that $X$ is reflexive. Our observations show in particular that a reflexive space is necessarily complete.


5.2 Reflexivity 


Theorem 5.9. A closed subspace of a reflexive Banach space is reflexive. 


Proof. Let $X$ be a reflexive Banach space and let $Y$ be a closed subspace of $X$. The restriction map

$$\psi : x^* \to x^*|_Y \qquad (x^* \in X^*)$$

is a norm-decreasing linear map of $X^*$ into $Y^*$. For each $y^{**} \in Y^{**}$, the function $y^{**} \circ \psi$ belongs to $X^{**}$; we thus have the (continuous linear) map

$$\chi : y^{**} \in Y^{**} \to y^{**} \circ \psi \in X^{**}.$$

Let $\kappa$ denote the canonical imbedding of $X$ onto $X^{**}$ (recall that $X$ is reflexive!), and consider the (continuous linear) map

$$\kappa^{-1} \circ \chi : Y^{**} \to X.$$

We claim that its range $Z$ is contained in $Y$. Indeed, suppose $z \in Z$ but $z \notin Y$. Since $Y$ is a closed subspace of $X$, there exists $x^* \in X^*$ such that $x^* Y = \{0\}$ and $x^* z = 1$. Then $\psi(x^*) = 0$, and since $z = (\kappa^{-1} \circ \chi)(y^{**})$ for some $y^{**} \in Y^{**}$, we have

$$1 = x^* z = (\kappa z)(x^*) = [\chi(y^{**})](x^*) = [y^{**} \circ \psi](x^*) = y^{**}(0) = 0,$$

a contradiction. Thus $Z \subset Y$.




Given $y^{**} \in Y^{**}$, consider then the element

$$y := [\kappa^{-1} \circ \chi](y^{**}) \in Y.$$

For any $y^* \in Y^*$, let $x^* \in X^*$ be an extension of $y^*$ (cf. Hahn–Banach theorem). Then

$$y^{**}(y^*) = y^{**}(\psi(x^*)) = [\chi(y^{**})](x^*) = [\kappa(y)](x^*) = x^*(y) = y^*(y) = (\kappa_Y y)(y^*),$$

where $\kappa_Y$ denotes the canonical imbedding of $Y$ into $Y^{**}$. This shows that $y^{**} = \kappa_Y y$, so that $\kappa_Y$ is onto, as wanted.


Theorem 5.10. If $X$ and $Y$ are isomorphic Banach spaces, then $X$ is reflexive if and only if $Y$ is reflexive.


Proof. Let $T : X \to Y$ be an isomorphism (i.e., a linear homeomorphism). Assume $Y$ reflexive; all we need to show is that $X$ is reflexive.

Given $y^* \in Y^*$ and any $T \in B(X, Y)$, the composition $y^* \circ T$ is a continuous linear functional on $X$, which we denote $T^* y^*$. This defines a map $T^* \in B(Y^*, X^*)$, called the (Banach) adjoint of $T$. One verifies easily that if $T^{-1} \in B(Y, X)$, then $(T^*)^{-1}$ exists and equals $(T^{-1})^*$.

For simplicity of notation, we shall use the “hat notation” ($\hat x$ and $\hat y$) for elements of $\hat X$ and $\hat Y$, without specifying the space in the hat symbol.

Let $x^{**} \in X^{**}$ be given. Then $x^{**} \circ T^* \in Y^{**}$, and since $Y$ is reflexive, there exists a unique $y \in Y$ such that

$$x^{**} \circ T^* = \hat y.$$

Let $x = T^{-1} y$. Then for all $x^* \in X^*$,

$$\hat x x^* = x^* x = x^* T^{-1} y = [(T^{-1})^* x^*] y = \hat y [(T^*)^{-1} x^*] = x^{**}[T^* (T^*)^{-1} x^*] = x^{**} x^*,$$

that is, $x^{**} = \hat x$, and $X$ is reflexive.


Theorem 5.11. A Banach space is reflexive iff its conjugate is reflexive.


Proof. Let $X$ be a reflexive Banach space, and let $\kappa$ be its canonical embedding onto $X^{**}$.

For any $\phi \in (X^*)^{**} = (X^{**})^*$ the map $\phi \circ \kappa$ is a continuous linear functional $x^* \in X^*$, and for any $x^{**} \in X^{**}$, letting $x := \kappa^{-1} x^{**}$, we have

$$\phi(x^{**}) = \phi(\kappa(x)) = x^* x = (\kappa x)(x^*) = x^{**} x^* = (\kappa_{X^*} x^*)(x^{**}).$$

This shows that $\kappa_{X^*}$ is onto, that is, $X^*$ is reflexive.

Conversely, if $X^*$ is reflexive, then $X^{**}$ is reflexive by the first part of the proof. Since $\kappa X$ is a closed subspace of $X^{**}$, it is reflexive by Theorem 5.9. Therefore, $X$ is reflexive since it is isomorphic to $\kappa X$, by Theorem 5.10.




We show here that Hilbert space and $L^p$-spaces (for $1 < p < \infty$) are reflexive.
A map $T : X \to Y$ between complex vector spaces is said to be conjugate-homogeneous if

$$T(\lambda x) = \bar\lambda T x \qquad (x \in X;\ \lambda \in \mathbb{C}).$$

An additive conjugate-homogeneous map is called a conjugate-linear map. In particular, we may talk of conjugate-isomorphisms.


Lemma 5.12. If $X$ is a Hilbert space (over $\mathbb{C}$), then there exists an isometric conjugate-isomorphism $V : X^* \to X$, such that

$$x^* x = (x, V x^*) \qquad (x \in X) \tag{1}$$

for all $x^* \in X^*$.


Proof. If $x^* \in X^*$, the “Little” Riesz representation theorem (Theorem 1.37) asserts that there exists a unique element $y \in X$ such that $x^* x = (x, y)$ for all $x \in X$. Denote $y = V x^*$, so that $V : X^* \to X$ is uniquely determined by the identity (1).

It follows from (1) that $V$ is conjugate-linear. If $V x^* \ne 0$, we have by (1) and Schwarz's inequality

$$\|V x^*\| = \frac{(V x^*, V x^*)}{\|V x^*\|} = \frac{|x^*(V x^*)|}{\|V x^*\|} \le \|x^*\| = \sup_{x \ne 0} \frac{|x^* x|}{\|x\|} = \sup_{x \ne 0} \frac{|(x, V x^*)|}{\|x\|} \le \|V x^*\|,$$

so that $\|V x^*\| = \|x^*\|$ (this is trivially true also in case $V x^* = 0$, since then $x^* = 0$ by (1)).


Being conjugate-linear and norm-preserving, $V$ is isometric, hence continuous and injective. It is also onto, because any $y \in X$ induces the functional $x^*$ defined by $x^* x = (x, y)$ for all $x \in X$, and clearly $V x^* = y$ by the uniqueness of the Riesz representation.
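In finite dimensions the map $V$ can be written down explicitly. The following sketch is an illustration only; the space $\mathbb{C}^n$ with its standard inner product and the coefficients $c_k$ are assumptions introduced for the example.

```latex
% Assumed data: X = \mathbb{C}^n with (x,y) = \sum_k x_k \overline{y_k}
% (linear in the first argument, as required by identity (1)).
% A functional x^* x = \sum_k c_k x_k is represented by
\[
  V x^* = (\overline{c_1}, \dots, \overline{c_n}),
  \qquad\text{since}\qquad
  (x, V x^*) = \sum_k x_k\, \overline{\overline{c_k}} = \sum_k c_k x_k = x^* x .
\]
% Conjugate-homogeneity and the isometry are then visible directly:
\[
  V(\lambda x^*) = (\overline{\lambda c_1}, \dots, \overline{\lambda c_n})
                 = \overline{\lambda}\, V x^*,
  \qquad
  \|V x^*\| = \Bigl(\sum_k |c_k|^{2}\Bigr)^{1/2} = \|x^*\| .
\]
```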


Theorem 5.13. Hilbert space is reflexive.


Proof. Denote by $J$ the conjugation operator in $\mathbb{C}$.
Given $x^{**} \in X^{**}$ (for $X$ a complex Hilbert space), the map

$$J \circ x^{**} \circ V^{-1} : X \to \mathbb{C}$$

is continuous and linear. Denote it by $x_0^*$. Let $x := V x_0^*$. Then for all $x^* \in X^*$,

$$\hat x x^* = x^* x = (x, V x^*) = (V x_0^*, V x^*) = \overline{(V x^*, V x_0^*)} = (J \circ x_0^* \circ V) x^* = x^{**} x^*,$$

that is, $x^{**} = \hat x$.




In particular, finite dimensional spaces $\mathbb{C}^n$ are reflexive. Also $L^2(\mu)$ (for any positive measure space $(X, \mathcal{A}, \mu)$) is reflexive. Theorem 5.14 establishes the reflexivity of all $L^p$-spaces for $1 < p < \infty$.


Theorem 5.14. Let $(X, \mathcal{A}, \mu)$ be a positive measure space. Then the space $L^p(\mu)$ is reflexive for $1 < p < \infty$.


Proof. Let $q = p/(p - 1)$. Write $L^p := L^p(\mu)$ and

$$(f, g) := \int_X f g\, d\mu \qquad (f \in L^p,\ g \in L^q).$$

By Theorem 4.6, there exists an isometric isomorphism

$$V_p : (L^p)^* \to L^q$$

such that

$$x^* f = (f, V_p x^*) \qquad (f \in L^p)$$

for all $x^* \in (L^p)^*$.
Given $x^{**} \in (L^p)^{**}$, the map $x^{**} \circ (V_p)^{-1}$ is a continuous linear functional on $L^q$; therefore

$$f := V_q\bigl(x^{**} \circ (V_p)^{-1}\bigr) \in L^p.$$

Let $x^* \in (L^p)^*$, and write $g := V_p x^* \in L^q$; we have

$$\hat f(x^*) = x^* f = (f, g) = [(V_q)^{-1} f](g) = x^{**}(V_p^{-1} g) = x^{**} x^*.$$

This shows that $x^{**} = \hat f$.


The theorem is false in general for $p = 1$ and $p = \infty$.
Also the space $C_0(X)$ (for a locally compact Hausdorff space $X$) is not reflexive in general. We shall not prove these facts here.


5.3 Separation 


We now consider applications of the Hahn—Banach lemma to separation of convex 
sets in vector spaces. 

Let $X$ be a vector space over $\mathbb{C}$ or $\mathbb{R}$. A convex combination of vectors $x_k \in X$ ($k = 1, \dots, n$) is any vector of the form

$$\sum_{k=1}^n \alpha_k x_k \qquad \Big(\alpha_k \ge 0;\ \sum_k \alpha_k = 1\Big).$$

A subset $K \subset X$ is convex if it contains the convex combinations of any two vectors in it.




Equivalently, a set $K \subset X$ is convex if it is invariant under the operation of taking convex combinations of its elements. Indeed, invariance under convex combinations of pairs of elements is precisely the definition of convexity. On the other hand, if $K$ is convex, one can prove the said invariance by induction on the number $n$ of vectors. Assuming invariance for $n \ge 2$ vectors, consider any convex combination $z = \sum_{k=1}^{n+1} \alpha_k x_k$ of vectors $x_k \in K$. If $\alpha := \sum_{k=1}^n \alpha_k = 0$, then $z = x_{n+1} \in K$ trivially. So assume $\alpha > 0$; since $\alpha_{n+1} = 1 - \alpha$, we have

$$z = \alpha \sum_{k=1}^n \frac{\alpha_k}{\alpha} x_k + (1 - \alpha) x_{n+1} \in K,$$

by the induction hypothesis and the convexity of $K$.

The intersection of a family of convex sets is clearly convex. The convex hull
of a set $M$ (denoted co$(M)$) is the intersection of all convex sets containing $M$. It
is the smallest convex set containing $M$ ("smallest" with respect to set inclusion),
and consists of all the convex combinations of vectors in $M$.

If $M, N$ are convex subsets of $X$ and $\alpha, \beta$ are scalars, then $\alpha M + \beta N$ is convex.
Also $TM$ is convex for any linear map $T : X \to Y$ between vector spaces.

Let $M \subset X$; the point $x \in M$ is an internal point of $M$ if for each $y \in X$,
there exists $\epsilon = \epsilon(y) > 0$ such that $x + \alpha y \in M$ for all $\alpha \in \mathbb{C}$ with $|\alpha| \le \epsilon$. Clearly, an
internal point of $M$ is also an internal point of any $N$ such that $M \subset N \subset X$.

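Suppose 0 is an internal point of the convex set $M$. Then for each $y \in X$, there
exists $\epsilon := \epsilon(y)$ such that $y/\rho \in M$ for all $\rho \ge 1/\epsilon$. If $y/\rho \in M$, then by convexity
$y/(\rho/\alpha) = (1 - \alpha)0 + \alpha(y/\rho) \in M$ for all $0 < \alpha \le 1$. Since $\rho \le \rho/\alpha < \infty$ for such
$\alpha$, this means that $y/\rho' \in M$ for all $\rho' \ge \rho$, that is, the set $\{\rho > 0;\ y/\rho \in M\}$ is a
subray of $\mathbb{R}^+$ that contains $1/\epsilon$. Let $\kappa(y)$ be the left endpoint of that ray, that is,

$$\kappa(y) := \inf\{\rho > 0;\ y/\rho \in M\}.$$

Then $\kappa(y) \le 1/\epsilon$, so that $0 \le \kappa(y) < \infty$, and $\kappa(y) \le 1$ for $y \in M$ (equivalently,
if $\kappa(y) > 1$, then $y \in M^c$). If $\alpha > 0$,

$$\alpha\{\rho > 0;\ y/\rho \in M\} = \{\alpha\rho;\ (\alpha y)/(\alpha\rho) \in M\},$$

and it follows that $\kappa(\alpha y) = \alpha\kappa(y)$ (this is also trivially true for $\alpha = 0$, since
$0 \in M$).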

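If $x, y \in X$ and $\rho > \kappa(x)$, $\sigma > \kappa(y)$, then $x/\rho,\ y/\sigma \in M$, and since $M$ is
convex,

$$\frac{x + y}{\rho + \sigma} = \frac{\rho}{\rho + \sigma}\,\frac{x}{\rho} + \frac{\sigma}{\rho + \sigma}\,\frac{y}{\sigma} \in M.$$

Hence, $\kappa(x + y) \le \rho + \sigma$, and therefore $\kappa(x + y) \le \kappa(x) + \kappa(y)$.
We conclude that $\kappa : X \to [0, \infty)$ is a subadditive positive-homogeneous
functional, referred to as the Minkowski functional of the convex set $M$.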
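
The defining infimum can be approximated numerically. The following is a minimal sketch, not part of the text: it assumes $M$ is given by a membership test (the hypothetical callable `in_set`), that 0 is internal to $M$, and that the initial upper bound `hi` is large enough. It relies on the ray property established above, which makes bisection valid. The ellipse used as $M$ is an illustrative choice, for which $\kappa(y) = (y_1^2/4 + y_2^2)^{1/2}$.

```python
def minkowski_functional(in_set, y, hi=1e6, tol=1e-10):
    """Approximate kappa(y) = inf{rho > 0 : y/rho in M} by bisection.

    in_set(point) -> bool is a membership test for a convex set M
    that has 0 as an internal point (an assumption of this sketch).
    """
    if not in_set(tuple(c / hi for c in y)):
        raise ValueError("y/hi is not in M; increase hi")
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if in_set(tuple(c / mid for c in y)):
            hi = mid   # y/mid lies in M, so kappa(y) <= mid
        else:
            lo = mid   # y/mid lies outside M, so kappa(y) >= mid
    return hi


def in_ellipse(p):
    """Membership test for the illustrative set M = {x1^2/4 + x2^2 <= 1}."""
    return p[0] ** 2 / 4 + p[1] ** 2 <= 1


def kappa(y):
    return minkowski_functional(in_ellipse, y)


x, y = (2.0, 0.0), (0.0, 3.0)
s = (x[0] + y[0], x[1] + y[1])
print(kappa(x), kappa(y), kappa(s))  # kappa(x + y) should not exceed kappa(x) + kappa(y)
```

On this example the computed values can be compared with the closed form, and subadditivity and positive homogeneity can be checked up to the bisection tolerance.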


Lemma 5.15. Let $M \subset X$ be convex with 0 internal, and let $\kappa$ be its Minkowski
functional. Then $\kappa(x) < 1$ iff $x$ is an internal point of $M$, and $\kappa(x) > 1$ iff $x$ is
an internal point of $M^c$.




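Proof. Let $x$ be an internal point of $M$, and let $\epsilon(\cdot)$ be as in the definition.
Then $x + \epsilon(x)x \in M$, and therefore

$$\kappa(x) \le \frac{1}{1 + \epsilon(x)} < 1.$$

Conversely, suppose $\kappa(x) < 1$. Then $x = x/1 \in M$. Let $y \in X$. Since 0
is internal for $M$, there exists $\epsilon_0 > 0$ (depending on $y$) such that $\beta y \in M$ for
$|\beta| \le \epsilon_0$. In particular $\kappa(\epsilon_0 \omega y) \le 1$ for all $\omega \in \mathbb{C}$ with $|\omega| = 1$. Now (with any
$\alpha = |\alpha|\omega$),

$$\kappa(x + \alpha y) \le \kappa(x) + \kappa(\alpha y) = \kappa(x) + \frac{|\alpha|}{\epsilon_0}\,\kappa(\epsilon_0 \omega y) \le \kappa(x) + |\alpha|/\epsilon_0 < 1$$

for $|\alpha| \le \epsilon$, with $\epsilon < [1 - \kappa(x)]\epsilon_0$. Hence, $x + \alpha y \in M$ for $|\alpha| \le \epsilon$, so that $x$ is an
internal point of $M$.

If $x$ is an internal point of $M^c$, there exists $\epsilon > 0$ such that $x - \epsilon x \in M^c$.
However, if $\kappa(x) \le 1$, then $1/(1 - \epsilon) > 1 \ge \kappa(x)$, and therefore $x - \epsilon x \in M$,
which is a contradiction. Thus $\kappa(x) > 1$.

Conversely, suppose $\kappa(x) > 1$. Let $y \in X$, and choose $\epsilon_0$ as earlier. Then

$$\kappa(x + \alpha y) \ge \kappa(x) - \frac{|\alpha|}{\epsilon_0}\,\kappa(-\epsilon_0 \omega y) \ge \kappa(x) - |\alpha|/\epsilon_0 > 1$$

if $|\alpha| \le \epsilon$ with $\epsilon < [\kappa(x) - 1]\epsilon_0$. This shows that $x + \alpha y \in M^c$ for $|\alpha| \le \epsilon$, and so
$x$ is an internal point of $M^c$.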


By the lemma, $\kappa(x) = 1$ iff $x$ is not internal for $M$ and for $M^c$; such a point
is called a bounding point for $M$ (or for $M^c$).

We shall apply the Hahn-Banach lemma (Lemma 5.1) with $p = \kappa$ to obtain
the following.


Theorem 5.16 (Separation theorem). Let $M, N$ be disjoint non-empty
convex sets in the vector space $X$ (over $\mathbb{C}$ or $\mathbb{R}$), and suppose $M$ has an internal
point. Then there exists a non-zero linear functional $f$ on $X$ such that

$$\sup \Re f(M) \le \inf \Re f(N)$$

(one says that $f$ separates $M$ and $N$).


Proof. Suppose that the theorem is valid for vector spaces over $\mathbb{R}$. If $X$ is a
vector space over $\mathbb{C}$, we may consider it as a vector space over $\mathbb{R}$, and get an
$\mathbb{R}$-linear non-zero functional $\phi : X \to \mathbb{R}$ such that $\sup \phi(M) \le \inf \phi(N)$. Setting
$f(x) := \phi(x) - i\phi(ix)$ as in the proof of Theorem 5.2, we obtain a non-zero $\mathbb{C}$-linear
functional on $X$ such that $\sup \Re f(M) = \sup \phi(M) \le \inf \phi(N) = \inf \Re f(N)$, as
wanted.

This shows that we need to prove the theorem for vector spaces over R only. 
We may also assume that 0 is an internal point of M. Indeed, suppose the 
theorem is valid in that case. By assumption, M has an internal point x. Thus 




for each $y \in X$, there exists $\epsilon > 0$ such that $x + \alpha y \in M$ for all $|\alpha| \le \epsilon$.
Equivalently, $0 + \alpha y \in M - x$ for all such $\alpha$, that is, 0 is internal for $M - x$. The
sets $N - x$ and $M - x$ are disjoint convex sets, and the theorem (for the special
case 0 internal to $M - x$) implies the existence of a non-zero linear functional
$f$ such that $\sup f(M - x) \le \inf f(N - x)$. Therefore, $\sup f(M) \le \inf f(N)$ as
desired.

Fix $z \in N$ and let $K := M - N + z$. Then $K$ is convex, $M \subset K$, and therefore
0 is an internal point of $K$. Let $\kappa$ be the Minkowski functional of $K$. Since $M$
and $N$ are disjoint, $z \notin K$, and therefore $\kappa(z) \ge 1$.

Define $f_0 : \mathbb{R}z \to \mathbb{R}$ by $f_0(\lambda z) = \lambda\kappa(z)$. Then $f_0$ is linear and $\kappa$-dominated
(since for $\lambda \ge 0$, $f_0(\lambda z) := \lambda\kappa(z) = \kappa(\lambda z)$, and for $\lambda < 0$, $f_0(\lambda z) < 0 \le \kappa(\lambda z)$).
By the Hahn-Banach lemma (Lemma 5.1), there exists a $\kappa$-dominated linear
extension $f : X \to \mathbb{R}$ of $f_0$. Then $f(z) = f_0(z) = \kappa(z) \ge 1$, and $f(x) \le \kappa(x) \le 1$
for all $x \in K$. This means that $f$ is a non-zero linear functional on $X$ such that

$$f(M) - f(N) + f(z) = f(M - N + z) = f(K) \le 1 \le f(z),$$

that is, $f(M) \le f(N)$.


5.4 Topological vector spaces 

We consider next a vector space $X$ with a Hausdorff topology such that the
vector space operations are continuous (a topological vector space). The function

$$f : (x, y, \alpha) \in X \times X \times [0,1] \to \alpha x + (1 - \alpha)y \in X$$

is continuous. The set $M \subset X$ is convex iff $f(M \times M \times [0,1]) \subset M$. Therefore,
by continuity of $f$, if $M$ is convex, we have

$$f(\overline{M} \times \overline{M} \times [0,1]) = f(\overline{M \times M \times [0,1]}) \subset \overline{f(M \times M \times [0,1])} \subset \overline{M},$$

which proves that the closure of a convex set is convex. A trivial modification of
the proof shows that the closure of a subspace is a subspace.

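Let $M^\circ$ denote the interior of $M$. We show that for $0 < \alpha < 1$ and $M \subset X$
convex,

$$\alpha M^\circ + (1 - \alpha)\overline{M} \subset M^\circ. \qquad (1)$$

In particular, it follows from (1) that $M^\circ$ is convex.

Let $x \in M^\circ$, and let $V$ be a neighborhood of $x$ contained in $M$. Since addition
and multiplication by a non-zero scalar are homeomorphisms of $X$ onto itself,
$U := V - x$ is a neighborhood of 0 and $y + \beta U$ is a neighborhood of $y$ for any
$0 \ne \beta \in \mathbb{R}$. Therefore, if $y \in \overline{M}$, there exists $y_\beta \in M \cap (y + \beta U)$. Thus there exists
$u \in U$ such that $y = y_\beta - \beta u$. Then, given $\alpha \in (0,1)$ and choosing $\beta = \alpha/(\alpha - 1)$,
we have by convexity of $M$

$$\alpha x + (1 - \alpha)y = \alpha x + (1 - \alpha)y_\beta + (\alpha - 1)\beta u = \alpha(x + u) + (1 - \alpha)y_\beta \in \alpha V + (1 - \alpha)y_\beta := V_\beta \subset \alpha M + (1 - \alpha)M \subset M,$$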



where $V_\beta$ is clearly open. This proves that $\alpha x + (1 - \alpha)y \in M^\circ$, as wanted.
If $M^\circ \ne \emptyset$ and we fix $x \in M^\circ$, the continuity of the vector space operations
implies that

$$y = \lim_{\alpha \to 0+} [\alpha x + (1 - \alpha)y] \qquad (y \in \overline{M}),$$

and it follows from (1) that $M^\circ$ is dense in $\overline{M}$.

With notation as before, it follows from the continuity of multiplication by
scalars that for any $y \in X$, there exists $\epsilon = \epsilon(y)$ such that $\alpha y \in U$ for all $\alpha \in \mathbb{C}$
with $|\alpha| \le \epsilon$; thus, for these $\alpha$, $x + \alpha y \in x + U = V \subset M$. This shows that
interior points of $M$ are internal for $M$.

It follows that bounding points for $M$ are boundary points of $M$.

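Conversely, if $x \in M$ is internal and $M^\circ \ne \emptyset$, pick $m \in M^\circ$. Then there exists
$\epsilon > 0$ such that

$$(1 + \epsilon)x - \epsilon m = x + \epsilon(x - m) := m' \in M.$$

Therefore, by (1),

$$x = \frac{\epsilon}{1 + \epsilon}\, m + \frac{1}{1 + \epsilon}\, m' \in M^\circ.$$

Thus internal points for $M$ are interior points of $M$ (when the interior is not
empty).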

Still for $M$ convex with non-empty interior, suppose $y$ is a boundary point
of $M$. Pick $x \in M^\circ$. For $0 < \alpha < 1$, $y + \alpha(x - y) = \alpha x + (1 - \alpha)y \in M^\circ$ by
(1), and therefore $y$ is not internal for $M^c$. It is not internal for $M$ as well (since
internal points are interior!). Thus $y$ is a bounding point for $M$. Collecting, we
have:


Lemma 5.17. Let M be a convex set with non-empty interior in a topological 
vector space. A point is internal (bounding) of M iff it is an interior (boundary) 
point of M. 


Lemma 5.18. Let $X$ be a topological vector space, let $M, N$ be non-empty subsets
of $X$ with $M^\circ \ne \emptyset$. If $f$ is a linear functional on $X$ that separates $M$ and $N$,
then $f$ is continuous.


Proof. Since $(\Im f)(x) = -(\Re f)(ix)$ in the case of complex vector spaces, it
suffices to show that $\Re f$ is continuous, and this reduces the complex case to the
real case. Let then $f : X \to \mathbb{R}$ be linear such that $\sup f(M) := \delta \le \inf f(N)$. Let
$m \in M^\circ$ and $n \in N$. Let then $U$ be a symmetric neighborhood of 0 (i.e., $-U = U$)
such that $m + U \subset M$ (if $V$ is any 0-neighborhood such that $m + V \subset M$, we
may take $U = V \cap (-V)$). Then $0 \in -U = U \subset M - m$, and therefore, for any
$u \in U$,

$$f(u) \le \sup f(M) - f(m) = \delta - f(m) \le f(n) - f(m),$$

and the same inequality holds for $-u$. In particular (taking $u = 0$), $f(n) - f(m) \ge 0$.
Pick any $\rho > f(n) - f(m)$. Then $f(u) < \rho$ and also $-f(u) = f(-u) < \rho$,
that is, $|f(u)| < \rho$ for all $u \in U$. Hence, given $\epsilon > 0$, the 0-neighborhood $(\epsilon/\rho)U$
is mapped by $f$ into $(-\epsilon, \epsilon)$, which proves that $f$ is continuous at 0. However,




continuity at 0 is equivalent to continuity for linear maps between topological 
vector spaces, as is readily seen by translation. 


Combining Lemmas 5.17 and 5.18 with Theorem 5.16, we obtain the following 
separation theorem for topological vector spaces. 


Theorem 5.19 (Separation theorem). In a topological vector space, any two 
disjoint non-empty convex sets, one of which has non-empty interior, can be 
separated by a non-zero continuous linear functional. 


If we have strict inequality in Theorem 5.16, the functional f strictly separates 
the sets M and N (it is necessarily non-zero). A strict separation theorem is 
stated below for a locally convex topological vector space (t.v.s.), that is, a t.v.s. 
whose topology has a base consisting of convex sets. 


Theorem 5.20 (Strict separation theorem). Let X be a locally convex t.v.s. 
Let M,N be non-empty disjoint convex sets in X. Suppose M is compact and N 
is closed. Then there exists a continuous linear functional on X which strictly 
separates M and N. 


Proof. Observe first that $M - N$ is closed. Indeed, if a net $\{m_i - n_i\}$ ($m_i \in M$;
$n_i \in N$; $i \in I$) converges to $x \in X$, then since $M$ is compact, a subnet
$\{m_{i'}\}$ converges to some $m \in M$. By continuity of vector space operations, the
net $\{n_{i'}\} = \{m_{i'} - (m_{i'} - n_{i'})\}$ converges to $m - x$, and since $N$ is closed,
$m - x := n \in N$. Therefore, $x = m - n \in M - N$ and $M - N$ is closed. It is also
convex.

Since $M, N$ are disjoint, the point 0 is in the open set $(M - N)^c$, and since
$X$ is locally convex, there exists a convex neighborhood of 0, $U$, disjoint from
$M - N$. By Theorem 5.19 (applied to the sets $M - N$ and $U$), there exists a
non-zero continuous linear functional $f$ separating $M - N$ and $U$:

$$\sup \Re f(U) \le \inf \Re f(M - N).$$

Since $f \ne 0$, there exists $y \in X$ such that $f(y) = 1$. By continuity of
multiplication by scalars, there exists $\epsilon > 0$ such that $\epsilon y \in U$. Then

$$\epsilon = \Re f(\epsilon y) \le \sup \Re f(U) \le \inf \Re f(M - N),$$

that is, $\Re f(n) + \epsilon \le \Re f(m)$ for all $m \in M$ and $n \in N$. Thus,

$$\sup \Re f(N) < \sup \Re f(N) + \epsilon \le \inf \Re f(M).$$


Taking M = {p}, we get the following: 


Corollary 5.21. Let $X$ be a locally convex t.v.s., let $N$ be a (non-empty) closed
convex set in $X$, and $p \notin N$. Then there exists a continuous linear functional $f$
strictly separating $p$ and $N$. In particular (with $N = \{q\}$, $q \ne p$), the continuous
linear functionals on $X$ separate the points of $X$ (i.e., if $p, q$ are any distinct
points of $X$, then there exists a continuous linear functional $f$ on $X$ such that
$f(p) \ne f(q)$).




5.5 Weak topologies


Here we consider topologies induced on a given vector space $X$ by families of
linear functionals on it. Let $\Gamma$ be a separating vector space of linear functionals
on $X$. Equivalently, if $\Gamma x = \{0\}$ (that is, $f(x) = 0$ for all $f \in \Gamma$), then $x = 0$. The
$\Gamma$-topology of $X$ is the weakest topology on $X$ for which all $f \in \Gamma$ are continuous.
A base for this topology consists of all sets of the form

$$N(x; \Delta, \epsilon) = \{y \in X;\ |f(y) - f(x)| < \epsilon \text{ for all } f \in \Delta\},$$

where $x \in X$, $\Delta \subset \Gamma$ is finite, and $\epsilon > 0$. The net $\{x_i;\ i \in I\}$ converges to $x$ in
the $\Gamma$-topology iff $f(x_i) \to f(x)$ for all $f \in \Gamma$. The vector space operations are
$\Gamma$-continuous, and the sets in the basis are clearly convex, so that $X$ with the
$\Gamma$-topology (sometimes denoted $X_\Gamma$) is a locally convex t.v.s. Let $X_\Gamma^*$ denote the
space of all continuous linear functionals on $X_\Gamma$. By definition of the $\Gamma$-topology,
$\Gamma \subset X_\Gamma^*$. We show here that we actually have equality between these sets.


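Lemma 5.22. Let $f_1, \ldots, f_n, g$ be linear functionals on the vector space $X$ such
that

$$\bigcap_{i=1}^{n} \ker f_i \subset \ker g.$$

Then $g \in$ span$\{f_1, \ldots, f_n\}$ (= the linear span of $f_1, \ldots, f_n$).


Proof. Consider the linear map

$$T : X \to \mathbb{C}^n, \qquad Tx = (f_1(x), \ldots, f_n(x)) \in \mathbb{C}^n.$$

Define $\phi : TX \to \mathbb{C}$ by $\phi(Tx) = g(x)$ ($x \in X$). If $Tx = Ty$, then
$x - y \in \bigcap_i \ker f_i \subset \ker g$, hence $g(x) = g(y)$, which shows that $\phi$ is well defined. It is
clearly linear, and has therefore an extension as a linear functional $\tilde{\phi}$ on $\mathbb{C}^n$. The
form of $\tilde{\phi}$ is $\tilde{\phi}(\lambda_1, \ldots, \lambda_n) = \sum_i \alpha_i \lambda_i$ with $\alpha_i \in \mathbb{C}$. In particular, for all $x \in X$,

$$g(x) = \phi(Tx) = \tilde{\phi}(Tx) = \sum_i \alpha_i f_i(x).$$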


Theorem 5.23. $X_\Gamma^* = \Gamma$.


Proof. It suffices to prove that if $0 \ne g$ is a $\Gamma$-continuous linear functional on
$X$, then $g \in \Gamma$. Let $U$ be the unit disc in $\mathbb{C}$. If $g$ is $\Gamma$-continuous, there exists
a basic neighborhood $N = N(0; f_1, \ldots, f_n; \epsilon)$ of zero (in the $\Gamma$-topology) such
that $g(N) \subset U$. If $x \in \bigcap_i \ker f_i := Z$, then $x \in N$, and therefore $|g(x)| < 1$.
But $Z$ is a subspace of $X$, hence $kx \in Z$ for all $k \in \mathbb{N}$, and so $k|g(x)| < 1$
for all $k$. This shows that $Z \subset \ker g$, and therefore, by Lemma 5.22,
$g \in$ span$\{f_1, \ldots, f_n\} \subset \Gamma$.


The following special $\Gamma$-topologies are especially important:




(1) If $X$ is a Banach space and $X^*$ is its conjugate space, the $X^*$-topology for
$X$ is called the weak topology for $X$ (the usual norm topology is also called
the strong topology, and is clearly stronger than the weak topology).

(2) If $X^*$ is the conjugate of the Banach space $X$, the $X$-topology for $X^*$ is
called the weak*-topology for $X^*$. It is in general weaker than the weak
topology (i.e., the $X^{**}$-topology) on $X^*$. The basis described earlier (in
the case of the weak*-topology) consists of the sets

$$N(x^*; x_1, \ldots, x_n; \epsilon) = \{y^* \in X^*;\ |y^* x_k - x^* x_k| < \epsilon,\ k = 1, \ldots, n\}$$

with $x^* \in X^*$, $x_k \in X$, $\epsilon > 0$, $n \in \mathbb{N}$.


A net $\{x_i^*\}$ converges weak* to $x^*$ if and only if $x_i^* x \to x^* x$ for all $x \in X$
(this is pointwise convergence of the functions $x_i^*$ to $x^*$ on $X$).


Theorem 5.24 (Alaoglu's theorem). Let $X$ be a Banach space. Then the
(strongly) closed unit ball of $X^*$

$$S^* := \{x^* \in X^*;\ \|x^*\| \le 1\}$$

is compact in the weak* topology.


Proof. Let

$$\Delta(x) = \{\lambda \in \mathbb{C};\ |\lambda| \le \|x\|\} \qquad (x \in X),$$

and

$$\Delta = \prod_{x \in X} \Delta(x)$$

with the Cartesian product topology. By Tychonoff's theorem, $\Delta$ is compact.

If $f \in S^*$, $f(x) \in \Delta(x)$ for each $x \in X$, so that $f \in \Delta$, that is, $S^* \subset \Delta$.
Convergence in the relative $\Delta$-topology on $S^*$ is pointwise convergence at all
points $x \in X$, and this is precisely weak*-convergence in $S^*$. The theorem will
then follow if we show that $S^*$ is closed in $\Delta$. Suppose $\{f_i;\ i \in I\}$ is a net in
$S^*$ converging in $\Delta$ to some $f$. This means that $f_i(x) \to f(x)$ for all $x \in X$.
Therefore, for each $x, y \in X$ and $\lambda \in \mathbb{C}$,

$$f(x + \lambda y) = \lim_i f_i(x + \lambda y) = \lim_i [f_i(x) + \lambda f_i(y)] = \lim_i f_i(x) + \lambda \lim_i f_i(y) = f(x) + \lambda f(y),$$

and since $|f(x)| \le \|x\|$, we conclude that $f \in S^*$.


Theorem 5.25 (Goldstine's theorem). Let $S$ and $S^{**}$ be the strongly
closed unit balls in $X$ and $X^{**}$, respectively, and let $\kappa : S \to S^{**}$ be
the canonical embedding (cf. comments following Corollary 5.8). Then $\kappa S$ is
weak*-dense in $S^{**}$.


Proof. Let $\overline{\kappa S}$ denote the weak*-closure of $\kappa S$. Proceeding by contradiction,
suppose $x^{**} \in S^{**}$ is not in $\overline{\kappa S}$. We apply Corollary 5.21 in the locally convex




t.v.s. $X^{**}$ with the weak*-topology. There exists then a (weak*-)continuous
linear functional $F$ on $X^{**}$ and a real number $\lambda$ such that

$$\Re F(x^{**}) > \lambda > \sup_{x \in S} \Re F(\hat{x}), \qquad (1)$$

where $\hat{x} := \kappa x$.

The weak*-topology on $X^{**}$ is the $\Gamma$-topology on $X^{**}$ where $\Gamma$ consists of
all the linear functionals on $X^{**}$ of the form $x^{**} \to x^{**} x^*$, with $x^* \in X^*$. By
Theorem 5.23, the functional $F$ is of this form, for a suitable $x^*$. We can then
rewrite (1) as follows:

$$\Re x^{**} x^* > \lambda > \sup_{x \in S} \Re x^* x. \qquad (2)$$

For any $x \in S$, write $|x^* x| = \omega x^* x$ with $|\omega| = 1$. Then by (2), since $\omega x \in S$
whenever $x \in S$ and $\|x^{**}\| \le 1$,

$$|x^* x| = x^*(\omega x) = \Re x^*(\omega x) < \lambda < \Re x^{**} x^* \le |x^{**} x^*| \le \|x^*\|.$$

Hence

$$\|x^*\| = \sup_{x \in S} |x^* x| \le \lambda < \|x^*\|,$$

contradiction.


In contrast to Theorem 5.24, we have: 


Theorem 5.26. The closed unit ball S of a Banach space X is weakly compact 
if and only if X is reflexive. 


Proof. Observe that $\kappa : S \to S^{**}$ is continuous when $S$ and $S^{**}$ are endowed
with the (relative) weak and weak* topologies, respectively. (The net $\{x_i\}_{i \in I}$
converges weakly to $x$ in $S$ if and only if $x^* x_i \to x^* x$ for all $x^* \in X^*$, i.e.,
$\hat{x}_i(x^*) \to \hat{x}(x^*)$ for all $x^*$, which is equivalent to $\kappa(x_i) \to \kappa(x)$ weak*. This
shows actually that $\kappa$ is a homeomorphism of $S$ with the relative weak topology
and $\kappa S$ with the relative weak* topology.) Therefore, if we assume that $S$ is
weakly compact, it follows that $\kappa S$ is weak*-compact. Thus $\kappa S$ is weak*-closed,
and since it is weak*-dense in $S^{**}$ (by Theorem 5.25), it follows that $\kappa S = S^{**}$.
By linearity of $\kappa$, we then conclude that $\kappa X = X^{**}$, so that $X$ is reflexive.
Conversely, if $X$ is reflexive, $\kappa S = S^{**}$ (since $\kappa$ is norm-preserving). But $S^{**}$
is weak*-compact by Theorem 5.24 (applied to the conjugate space $X^{**}$). Since
$\kappa$ is a homeomorphism of $S$ (with the weak topology) and $\kappa S$ (with the weak*
topology), as we observed earlier, it follows that $S$ is weakly compact.


It is natural to ask about compactness of S in the strong topology (i.e., the 
norm-topology). We have: 


Theorem 5.27. The strongly closed unit ball S of a Banach space X is strongly 
compact iff X is finite dimensional. 


Proof. If $X$ is an $n$-dimensional Banach space, there exists a (linear)
homeomorphism $\tau : X \to \mathbb{C}^n$. Then $\tau S$ is closed and bounded in $\mathbb{C}^n$, hence
compact, by the Heine-Borel theorem. Therefore $S$ is compact in $X$.




Conversely, if $S$ is (strongly) compact, its open covering $\{B(x, 1/2);\ x \in S\}$ by
balls has a finite subcovering $\{B(x_k, 1/2);\ k = 1, \ldots, n\}$ ($x_k \in S$). Let $Y$ be the
linear span of the vectors $x_k$. Then $Y$ is a closed subspace of $X$ of dimension $\le n$.
Suppose $x^* \in X^*$ vanishes on $Y$. Given $x \in S$, there exists $k$, $1 \le k \le n$, such
that $x \in B(x_k, 1/2)$. Then

$$|x^* x| = |x^*(x - x_k)| \le \|x^*\|\,\|x - x_k\| \le \|x^*\|/2,$$

and therefore

$$\|x^*\| = \sup_{x \in S} |x^* x| \le \|x^*\|/2;$$

this shows that $\|x^*\| = 0$, and so $X = Y$ by Corollary 5.4.


It follows from Theorems 5.24 and 5.27 that the weak*-topology is strictly 
weaker than the strong topology on X* when X (hence X*) is infinite 
dimensional. Similarly, by Theorems 5.26 and 5.27, the weak topology on an 
infinite-dimensional reflexive Banach space is strictly weaker than the strong 
topology. 
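
A standard illustration, not taken from the text, is the sequence of unit vectors $e_n$ in $\ell^2$: the pairings $\langle e_n, y\rangle$ tend to 0 for every fixed $y \in \ell^2$, yet $\|e_m - e_n\| = \sqrt{2}$ for $m \ne n$, so the sequence converges weakly but has no norm-convergent subsequence. The sketch below works with finite truncations, so it only suggests the behavior; the truncation level `DIM` and the choice $y = (1/k)$ are assumptions made for the example.

```python
import math


def basis_vector(n, dim):
    """The n-th standard basis vector e_n (0-indexed), truncated to `dim` coordinates."""
    return [1.0 if k == n else 0.0 for k in range(dim)]


def inner(x, y):
    """Real inner product of two finite sequences of equal length."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x))


DIM = 2000                                 # truncation level (an assumption of this sketch)
y = [1.0 / (k + 1) for k in range(DIM)]    # truncation of the square-summable sequence (1/k)

# The pairings <e_n, y> = y[n] shrink as n grows, as weak convergence to 0 requires ...
pairings = [inner(basis_vector(n, DIM), y) for n in (10, 100, 1000)]

# ... while distinct basis vectors stay sqrt(2) apart in norm.
gap = norm([a - b for a, b in zip(basis_vector(3, DIM), basis_vector(7, DIM))])

print(pairings, gap)
```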


5.6 Extremal points 


As an application of the strict separation theorem (cf. Corollary 5.21), we shall 
prove the Krein—Milman theorem on extremal points. 
Let $X$ be a vector space (over $\mathbb{C}$ or $\mathbb{R}$). If $x, y \in X$, denote

$$\overline{xy} := \{\alpha x + (1 - \alpha)y;\ 0 < \alpha < 1\}.$$

Let $K \subset X$. A non-empty subset $A \subset K$ is extremal in $K$ if

$$[x, y \in K;\ \overline{xy} \cap A \ne \emptyset] \text{ implies } [x, y \in A].$$

If $A = \{a\}$ (a singleton) is extremal in $K$, we say that $a$ is an extremal point of
$K$: the criterion for this is

$$[x, y \in K;\ a \in \overline{xy}] \text{ implies } [x = y = a].$$

Trivially, any non-empty $K$ is extremal in itself. If $B$ is extremal in $A$ and $A$ is
extremal in $K$, then $B$ is extremal in $K$. The non-empty intersection of a family
of extremal sets in $K$ is an extremal set in $K$.

From now on, let $X$ be a locally convex t.v.s. and $K \subset X$. If $A$ is a compact
extremal set in $K$ that contains no proper compact extremal subset, we call it a
minimal compact extremal set in $K$.


Lemma 5.28. A minimal compact extremal set A in K is a singleton. 


Proof. Suppose $A$ contains two distinct points $a, b$. There exists $x^* \in X^*$ such
that $f := \Re x^*$ assumes distinct values at these points (cf. Corollary 5.21). Let
$\rho = \min_A f$; since $A$ is compact, the minimum $\rho$ is attained on a non-empty




subset $B \subset A$, and $B \ne A$ since $f$ is not constant on $A$ ($f(a) \ne f(b)$!). The set $B$
is a closed subset of the compact set $A$, and is therefore compact. We complete
the proof by showing that $B$ is extremal in $A$ (hence in $K$, contradicting the
minimality assumption on $A$). Let $x, y \in A$ be such that $\overline{xy} \cap B \ne \emptyset$. Then there
exists $\alpha \in (0,1)$ such that

$$\rho = f(\alpha x + (1 - \alpha)y) = \alpha f(x) + (1 - \alpha) f(y). \qquad (1)$$

We have $f(x) \ge \rho$ and $f(y) \ge \rho$; if either inequality is strict, we get the
contradiction $\rho > \rho$ in (1). Hence, $f(x) = f(y) = \rho$, that is, $x, y \in B$.


Lemma 5.29. If $K \ne \emptyset$ is compact, then it has extremal points.


Proof. Let $\mathcal{A}$ be the family of all compact extremal subsets of $K$. It is non-empty,
since $K \in \mathcal{A}$, and partially ordered by set inclusion. If $\mathcal{B} \subset \mathcal{A}$ is totally
ordered, then $\bigcap \mathcal{B}$ is a non-empty compact extremal set in $K$, that is, belongs
to $\mathcal{A}$, and is a lower bound for $\mathcal{B}$. By Zorn's lemma, $\mathcal{A}$ has a minimal element,
which is a singleton $\{a\}$ by Lemma 5.28. Thus, $K$ has the extremal point $a$.


If $E \subset X$, its closed convex hull $\overline{\mathrm{co}}(E)$ is defined as the closure of its convex
hull co$(E)$.


Theorem 5.30 (Krein-Milman's theorem). Let $X$ be a locally convex t.v.s.,
and let $K \subset X$ be compact. Let $E$ be the set of extremal points of $K$. Then
$K \subset \overline{\mathrm{co}}(E)$.


Proof. We may assume $K \ne \emptyset$, and therefore $E \ne \emptyset$ by Lemma 5.29. Hence
$N := \overline{\mathrm{co}}(E)$ is a non-empty closed convex set. Suppose there exists $x \in K$ such
that $x \notin N$. By Corollary 5.21, there exists $x^* \in X^*$ such that $f(x) < \inf_N f$
(where $f = \Re x^*$). Let

$$B = \{k \in K;\ f(k) = \rho := \min_K f\}.$$

Then $B$ is extremal in $K$ (cf. proof of Lemma 5.28). Also $B$ is a non-empty
closed subset of the compact set $K$, hence is a non-empty compact set, and has
therefore an extremal point $b$, by Lemma 5.29. Therefore, $b$ is an extremal point
of $K$, that is, $b \in E \subset N$. Hence

$$\rho = f(b) \ge \inf_N f > f(x) \ge \min_K f = \rho,$$

contradiction.


Corollary 5.31. (With assumptions and notation as in Theorem 5.30.)
$\overline{\mathrm{co}}(K) = \overline{\mathrm{co}}(E)$.


In particular, a compact convex set in a locally convex t.v.s. is the closed convex
hull of its extremal points.
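
In the plane the last statement can be checked by hand or by machine. The sketch below is an illustration under stated assumptions, not part of the text: it takes $K$ to be the convex hull of a finite point set `P` in $\mathbb{R}^2$ (hypothetical sample data), computes the vertices of $K$ with Andrew's monotone chain algorithm (these vertices are the extremal points of the polygon $K$), and verifies that every point of `P` lies in the convex hull of the vertices.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_vertices(points):
    """Vertices of co(points), counter-clockwise (Andrew's monotone chain).

    Collinear boundary points are discarded, so only extremal points remain.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def in_hull(vertices, p, eps=1e-12):
    """True if p lies in the convex polygon with the given counter-clockwise vertices."""
    n = len(vertices)
    return all(cross(vertices[i], vertices[(i + 1) % n], p) >= -eps for i in range(n))


# Hypothetical sample: the corners of a square plus interior and edge points.
P = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0), (0.5, 1.5)]
E = hull_vertices(P)
print(E)
```

Only the four corners survive as extremal points; the edge point (1, 0) and the interior points are proper convex combinations of them.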




Consider for example the strongly closed unit ball S* of the conjugate X* of 
a normed space X. By Theorem 5.24, it is weak*-compact and trivially convex. 
It is therefore the weak*-closed convex hull of its extremal points. 

This remark will be applied as follows. Let $X$ be a compact Hausdorff space,
let $C(X)$ be the space of all complex continuous functions on $X$, and let $M(X)$
be the space of all regular complex Borel measures on $X$ (cf. Theorem 4.9). By
Theorem 4.9 (for $X$ compact!), the space $M(X)$ is isometrically isomorphic to the
dual space $C(X)^*$. Its strongly closed unit ball $M(X)_1$ is then weak*-compact
(by Theorem 5.24).

If $A$ is any subset of $C(X)$, let

$$Y := \Bigl\{\mu \in M(X);\ \int_X f \, d\mu = 0 \ (f \in A)\Bigr\}. \qquad (2)$$

Clearly $Y$ is weak*-closed, and therefore $K := Y \cap M(X)_1$ is weak*-compact
and trivially convex. It follows from Corollary 5.31 that $K$ is the weak*-closed
convex hull of its extremal points.

If $A$ is a closed subspace of $C(X)$ ($A \ne C(X)$), it follows from Corollary 5.4
and Theorem 4.9 that $Y \ne \{0\}$ (and $K$ is the strongly closed unit ball of the
Banach space $Y$).


Lemma 5.32. Let $S$ be the strongly closed unit ball of a normed space $Y \ne \{0\}$,
and let $a$ be an extremal point of $S$. Then $\|a\| = 1$.


Proof. Since $Y \ne \{0\}$, there exists $0 \ne y \in S$. Then $0 \ne -y \in S$ and
$0 = (1/2)y + (1/2)(-y)$, so that 0 is not extremal for $S$. Therefore, $a \ne 0$. Define
then $b = a/\|a\|$ ($\in S$). If $\|a\| < 1$, write $a = \|a\|b + (1 - \|a\|)0$, which is a
proper convex combination of two elements of $S$ distinct from $a$, contradicting
the hypothesis that $a$ is extremal. Hence, $\|a\| = 1$.


With notations and hypothesis as in the paragraph preceding the lemma,
let $\mu \in K$ be an extremal point of $K$ (so that $\|\mu\| = 1$), and let $E = \mathrm{supp}|\mu|$
(cf. Definition 3.26). Then $E \ne \emptyset$ (since $\|\mu\| = 1$!), and by Remark 4.10,

$$\int_X f \, d\mu = \int_E f \, d\mu \qquad (f \in C(X)). \qquad (3)$$
XxX E 


Lemma 5.33. Let $A \ne C(X)$ be a closed subalgebra of $C(X)$ containing the
identity 1. For $K$ as mentioned, let $\mu$ be an extremal point of $K$, and let
$E = \mathrm{supp}|\mu|$. If $f \in A$ is real on $E$, then $f$ is constant on $E$.


Proof. Assume first that $f \in A$ has range in (0, 1) over $E$, and consider the
measures $d\sigma = f \, d\mu$ and $d\tau = (1 - f) \, d\mu$. Write $d\mu = h \, d|\mu|$ with $h$ measurable
and $|h| = 1$ (cf. Theorem 1.46). By Theorem 1.47,

$$d|\sigma| = |fh| \, d|\mu| = f \, d|\mu|; \qquad d|\tau| = |(1 - f)h| \, d|\mu| = (1 - f) \, d|\mu|, \qquad (4)$$

hence

$$\|\sigma\| = \int_X f \, d|\mu| = \int_E f \, d|\mu|, \qquad (5)$$

and similarly

$$\|\tau\| = \int_E (1 - f) \, d|\mu|. \qquad (6)$$

Therefore

$$\|\sigma\| + \|\tau\| = \int_E d|\mu| = |\mu|(E) = \|\mu\| = 1. \qquad (7)$$

Since $f$ and $1 - f$ do not vanish identically on the support $E$ of $|\mu|$, it follows
from (5) and (6) and the discussion in Section 3.26 that $\|\sigma\| > 0$ and $\|\tau\| > 0$.
The measures $\sigma' = \sigma/\|\sigma\|$ and $\tau' = \tau/\|\tau\|$ are in $M(X)_1$, and for all $g \in A$,

$$\int_X g \, d\sigma' = \frac{1}{\|\sigma\|} \int_X g f \, d\mu = 0,$$

and similarly

$$\int_X g \, d\tau' = \frac{1}{\|\tau\|} \int_X g (1 - f) \, d\mu = 0,$$

since $A$ is an algebra. This means that $\sigma'$ and $\tau'$ are in $K$, and clearly

$$\mu = \|\sigma\|\sigma' + \|\tau\|\tau',$$

which is (by (7)) a proper convex combination. Since $\mu$ is an extremal point of
$K$, it follows that $\mu = \sigma'$. Therefore

$$\int_X g (f - \|\sigma\|) \, d\mu = 0$$

for all bounded Borel functions $g$ on $X$. Choose in particular $g = (f - \|\sigma\|)\bar{h}$.
Then

$$\int_E (f - \|\sigma\|)^2 \, d|\mu| = 0,$$

and consequently (cf. discussion following Definition 3.26) $f = \|\sigma\|$ identically
on $E$.

If $f \in A$ is real on $E$, there exist $\alpha \in \mathbb{R}$ and $\beta > 0$ such that $0 < \beta(f - \alpha) < 1$
on $E$. Since $1 \in A$, the function $f_0 := \beta(f - \alpha)$ belongs to $A$ and has range
in (0, 1) on $E$. By the first part of the proof, $f_0$ is constant on $E$, and therefore $f$ is
constant on $E$.


A non-empty subset $E \subset X$ with the property of the conclusion of
Lemma 5.33 (i.e., any function of $A$ that is real on $E$ is necessarily constant
on $E$) is called an antisymmetric set (for $A$). If $x, y \in X$ are contained in some
antisymmetric set, they are said to be equivalent (with respect to $A$). This is
an equivalence relation. Let $\mathcal{E}$ be the family of all equivalence classes. If $E$ is
antisymmetric, and $\tilde{E}$ is the equivalence class of any $p \in E$, then $E \subset \tilde{E} \in \mathcal{E}$.


Theorem 5.34 (Bishop's antisymmetry theorem). Let $X$ be a compact
Hausdorff space. Let $A$ be a closed subalgebra of $C(X)$ containing the constant
function 1. Define $\mathcal{E}$ as described. Suppose $g \in C(X)$ has the property:




(B) For each $E \in \mathcal{E}$, there exists $f \in A$ such that $g = f$ on $E$.
Then $g \in A$.


Proof. We may assume that $A \ne C(X)$. Suppose that $g \in C(X)$ satisfies
Condition (B), but $g \notin A$. By the Hahn-Banach theorem and the Riesz
representation theorem, there exists $\nu \in M(X)$, which we may normalize so that
$\|\nu\| \le 1$, such that $\int_X h \, d\nu = 0$ for all $h \in A$ and $\int_X g \, d\nu \ne 0$. In particular,
$K \ne \{0\}$ (since $\nu \in K$), and is the
weak*-closed convex hull of its extremal points. Let $\mu$ be an extremal point of
$K$. By Lemma 5.33, the set $E := \mathrm{supp}|\mu|$ is antisymmetric for $A$. Let $\tilde{E} \in \mathcal{E}$ be
as defined previously. By Condition (B), there exists $f \in A$ such that $g = f$ on
$\tilde{E}$, hence on $\mathrm{supp}|\mu| := E \subset \tilde{E}$. Therefore,

$$\int_X g \, d\mu = \int_X f \, d\mu = 0,$$

since $\mu \in K \subset Y$ and $f \in A$. Since $K$ is the weak*-closed convex hull of its
extremal points and $\nu \in K$, it follows that $\int_X g \, d\nu = 0$, contradiction!


We say that $A$ is selfadjoint if $\bar{f} \in A$ whenever $f \in A$. This implies of course
that $\Re f$ and $\Im f$ are in $A$ whenever $f \in A$ (and conversely); $A$ separates points
(of $X$) if whenever $x, y \in X$ are distinct points, there exists $f \in A$ such that
$f(x) \ne f(y)$.


Corollary 5.35 (Stone-Weierstrass theorem). Let $X$ be a compact
Hausdorff space, and let $A$ be a closed selfadjoint subalgebra of $C(X)$ containing
1 and separating points of $X$. Then $A = C(X)$.


Proof. If $E$ is antisymmetric for $A$ and $f \in A$, the real functions $\Re f, \Im f \in A$
are necessarily constant on $E$, and therefore $f$ is constant on $E$. Since $A$
separates points, it follows that $E$ is a singleton. Hence, equivalent points must
coincide, and so $\mathcal{E}$ consists of all singletons. But then Condition (B) is trivially
satisfied by any $g \in C(X)$: given $\{p\} \in \mathcal{E}$, choose $f = g(p)1 \in A$, then surely
$g = f$ on $\{p\}$.


5.7 The Stone-Weierstrass theorem


The Stone-Weierstrass theorem is one of the fundamental theorems of functional 
analysis, and it is worth giving it also an elementary proof, independent of the 
machinery developed previously in the chapter. 

Let $C_{\mathbb{R}}(X)$ denote the algebra (over $\mathbb{R}$) of all real continuous functions
on $X$, and let $A$ be a subalgebra (over $\mathbb{R}$). Since $h(u) := u^{1/2} \in C_{\mathbb{R}}([0,1])$
and $h(0) = 0$, the classical Weierstrass approximation theorem establishes the
existence of polynomials $p_n$ without free coefficient, converging to $h$ uniformly
on [0, 1]. Given a non-zero $f \in A$, the function $u(x) := f(x)^2/\|f\|^2 : X \to [0,1]$ belongs to
$A$, and therefore $p_n \circ u \in A$ converge uniformly on $X$ to $h(u(x)) = |f(x)|/\|f\|$.
Hence $|f| = \|f\| \cdot (|f|/\|f\|) \in \overline{A}$, where $\overline{A}$ denotes the closure of $A$ in $C_{\mathbb{R}}(X)$
with respect to the uniform norm.




If $f, g \in A$, since

$$\max(f, g) = \tfrac{1}{2}[f + g + |f - g|]; \qquad \min(f, g) = \tfrac{1}{2}[f + g - |f - g|],$$

it follows from the preceding conclusion that $\max(f, g)$ and $\min(f, g)$ belong to
$\overline{A}$ as well.
Formally:


Lemma 5.36. If $A$ is a subalgebra of $C_{\mathbb{R}}(X)$, then $|f|$, $\max(f, g)$ and $\min(f, g)$
belong to the uniform closure $\overline{A}$, for any $f, g \in A$.
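
One concrete choice of approximating polynomials, offered here only as an illustration (the text merely invokes the Weierstrass theorem), is the sequence of Bernstein polynomials of $h(u) = u^{1/2}$. Since $h(0) = 0$, these have no constant term. The sketch below evaluates them with the standard library, estimates the uniform error on a grid, and uses them to approximate $|t|$ by $c\,p_n(t^2/c^2)$ for $|t| \le c$; the function names and the grid size are assumptions of the example.

```python
import math


def bernstein_sqrt(n, u):
    """Value at u in [0, 1] of the n-th Bernstein polynomial of h(u) = sqrt(u)."""
    return sum(
        math.sqrt(k / n) * math.comb(n, k) * u ** k * (1 - u) ** (n - k)
        for k in range(n + 1)
    )


def sup_error(n, grid=200):
    """Largest |sqrt(u) - p_n(u)| over an evenly spaced grid in [0, 1]."""
    return max(
        abs(math.sqrt(i / grid) - bernstein_sqrt(n, i / grid))
        for i in range(grid + 1)
    )


def approx_abs(t, c, n):
    """Polynomial approximation of |t| for |t| <= c, namely c * p_n(t^2 / c^2)."""
    return c * bernstein_sqrt(n, (t / c) ** 2)


print(sup_error(50), sup_error(200))   # the grid error shrinks as n grows
print(approx_abs(-0.3, 1.0, 200))      # close to 0.3
```

With $|f - g|$ approximated this way, the two identities above yield polynomial approximations of $\max(f, g)$ and $\min(f, g)$ in $f$ and $g$.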


Lemma 5.37. Let $A$ be a separating subspace of $C_{\mathbb{R}}(X)$ containing 1. Then for
any distinct points $x_1, x_2 \in X$ and any $\alpha_1, \alpha_2 \in \mathbb{R}$, there exists $h \in A$ such that
$h(x_k) = \alpha_k$, $k = 1, 2$.


Proof. By hypothesis, there exists $g \in A$ such that $g(x_1) \ne g(x_2)$. Take

$$h(x) := \alpha_1 + \frac{\alpha_2 - \alpha_1}{g(x_2) - g(x_1)}\,[g(x) - g(x_1)].$$


We state now the Stone—Weierstrass theorem as an approximation theorem 
(real case first). 


Theorem 5.38. Let $X$ be a compact Hausdorff space. Let $A$ be a separating
subalgebra of $C_{\mathbb{R}}(X)$ containing 1. Then $A$ is dense in $C_{\mathbb{R}}(X)$.


Proof. Let $f \in C_{\mathbb{R}}(X)$ and $\epsilon > 0$ be given. Fix $x_0 \in X$. For any $x' \in X$, there
exists $f' \in A$ such that

$$f'(x_0) = f(x_0); \qquad f'(x') \le f(x') + \epsilon/2$$

(cf. Lemma 5.37).

By continuity of $f$ and $f'$, there exists an open neighborhood $V(x')$ of $x'$ such
that $f' \le f + \epsilon$ on $V(x')$. By compactness of $X$, there exist $x_k \in X$, $k = 1, \ldots, n$,
such that

$$X = \bigcup_{k=1}^{n} V(x_k).$$

Let $f_k \in A$ be the function $f'$ corresponding to the point $x' = x_k$ as shown, and
let

$$g := \min(f_1, \ldots, f_n).$$

Then $g \in \overline{A}$ by Lemma 5.36, and

$$g(x_0) = f(x_0); \qquad g \le f + \epsilon \text{ on } X.$$

By continuity of $f$ and $g$, there exists an open neighborhood $W(x_0)$ of $x_0$
such that

$$g \ge f - \epsilon \text{ on } W(x_0).$$




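We now vary $x_0$ (over $X$). The open cover $\{W(x_0);\ x_0 \in X\}$ of $X$ has a finite
subcover $\{W_1, \ldots, W_m\}$, corresponding to functions $g_1, \ldots, g_m \in \overline{A}$ as shown.
Thus

$$g_j \le f + \epsilon \text{ on } X$$

and

$$g_j \ge f - \epsilon \text{ on } W_j.$$

Define

$$h = \max(g_1, \ldots, g_m).$$

Then $h \in \overline{A}$ by Lemma 5.36, and

$$f - \epsilon \le h \le f + \epsilon \text{ on } X.$$

Therefore, $f \in \overline{A}$.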


Theorem 5.39 (The Stone-Weierstrass Theorem). Let $X$ be a compact
Hausdorff space, and let $A$ be a separating selfadjoint subalgebra of $C(X)$
containing 1. Then $A$ is dense in $C(X)$.


Proof. Let $A_{\mathbb{R}}$ be the algebra (over $\mathbb{R}$) of all real functions in $A$. It contains
1. Let $x_1, x_2$ be distinct points of $X$, and let then $h \in A$ be such that
$h(x_1) \ne h(x_2)$. Then either $\Re h(x_1) \ne \Re h(x_2)$ or $\Im h(x_1) \ne \Im h(x_2)$ (or both). Since
$\Re h = (h + \bar{h})/2 \in A_{\mathbb{R}}$ (since $A$ is selfadjoint), and similarly $\Im h \in A_{\mathbb{R}}$, it follows
that $A_{\mathbb{R}}$ is separating.

Let $f \in C(X)$. Then $\Re f \in C_{\mathbb{R}}(X)$, and therefore, by Theorem 5.38, there
exists a sequence $g_n \in A_{\mathbb{R}}$ converging to $\Re f$ uniformly on $X$. Similarly, there
exists a sequence $h_n \in A_{\mathbb{R}}$ converging to $\Im f$ uniformly on $X$. Then $g_n + i h_n \in A$
converge to $f$ uniformly on $X$.


5.8 Operators between Lebesgue spaces: 
Marcinkiewicz’s interpolation theorem 


5.40. Weak and strong types. Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be measure spaces,
and let $p, q \in [1, \infty]$ (presently, $q$ is not the conjugate exponent of $p$!). We
consider operators $T$ defined on $L^p(\mu)$ such that $Tf$ is a $\mathcal{B}$-measurable function
on $Y$ and

$$|T(f + g)| \le |Tf| + |Tg| \qquad (f, g \in L^p(\mu)). \qquad (1)$$

We refer to such operators $T$ as sublinear operators. (This includes linear
operators with range as described.) Let $n$ be the distribution function of $|Tf|$
for some fixed non-zero element $f$ of $L^p(\mu)$, relative to the measure $\nu$ on $\mathcal{B}$, that
is (cf. Definition 1.58)

$$n(y) := \nu([|Tf| > y]) \qquad (y > 0).$$




We say that $T$ is of weak type $(p,q)$ if there exists a constant $0 < M < \infty$ independent of $f$ and $y$ such that
\[ n(y)^{1/q} \le M \|f\|_p / y \tag{2} \]
in case $q < \infty$, and
\[ \|Tf\|_\infty \le M \|f\|_p \tag{3} \]
in case $q = \infty$. We say that $T$ is of strong type $(p,q)$ if there exists $M$ (as above) such that
\[ \|Tf\|_q \le M \|f\|_p. \tag{4} \]
The concepts of weak and strong types $(p,\infty)$ coincide by definition, while in general strong type $(p,q)$ implies weak type $(p,q)$, because by (2) in Definition 1.58 (for $q < \infty$)
\[ n(y)^{1/q} \le \|Tf\|_q / y \le M \|f\|_p / y. \tag{5} \]
The infimum of all $M$ for which (2) or (4) is satisfied is called the weak or strong $(p,q)$-norm of $T$, respectively. By (5), the weak $(p,q)$-norm of $T$ is no larger than its strong $(p,q)$-norm.
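To see that weak type can be strictly weaker than strong type when $q < \infty$, the following sketch may help. It is an illustration under stated assumptions, not taken from the text: $Y = (0,\infty)$ with Lebesgue measure $\nu$, the function $h(x) := x^{-1/q}$, and a bounded linear functional $\ell$ on $L^p(\mu)$ with $\|\ell\| \le 1$ are all hypothetical choices.

```latex
\begin{align*}
Tf &:= \ell(f)\,h, \qquad h(x) = x^{-1/q} \quad (x > 0), \\
n(y) &= \nu\big([\,|\ell(f)|\,x^{-1/q} > y\,]\big)
      = \nu\big(\{x;\ 0 < x < (|\ell(f)|/y)^q\}\big) = (|\ell(f)|/y)^q, \\
n(y)^{1/q} &= |\ell(f)|/y \le \|f\|_p / y
      \qquad \text{(weak type $(p,q)$, with $M = 1$)}, \\
\|Tf\|_q^q &= |\ell(f)|^q \int_0^\infty x^{-1}\,dx = \infty
      \quad \text{whenever } \ell(f) \neq 0
      \qquad \text{(not of strong type $(p,q)$)}.
\end{align*}
```

The operator is linear, so it is in particular sublinear; the point of the sketch is only that the quantity $y\,n(y)^{1/q}$ can stay bounded while $\|Tf\|_q$ is infinite.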

A linear operator $T$ is of strong type $(p,q)$ iff $T \in B(L^p(\mu), L^q(\nu))$, and in that case, the strong $(p,q)$-norm is the corresponding operator norm of $T$.

In the sequel, we consider only the case $q < \infty$. If $T$ is of weak type $(p,q)$, it follows from (2) that $n$ is finite and $n(\infty) = 0$, and consequently, by Theorem 1.59,
\[ \|Tf\|_q^q = q \int_0^\infty y^{q-1} n(y)\, dy, \tag{6} \]
where the two sides could be finite or infinite.

Let $u > 0$, and consider the decomposition $f = f_u + f'_u$ as in Technique 1.61. Since $f_u$ and $f'_u$ are both in $L^p(\mu)$ (for $f \in L^p(\mu)$), the given sublinear operator $T$ is defined on them, and
\[ |Tf| \le |Tf_u| + |Tf'_u|. \tag{7} \]
Let $n_u$ and $n'_u$ denote the distribution functions of $|Tf_u|$ and $|Tf'_u|$, respectively. Since by (7)
\[ [|Tf| > y] \subset [|Tf_u| > y/2] \cup [|Tf'_u| > y/2] \qquad (y > 0), \]
we have
\[ n(y) \le n_u(y/2) + n'_u(y/2). \tag{8} \]
We now assume that $T$ is of weak types $(r,s)$ and $(a,b)$, with $r \le s$ and $a \le b$, and $s \neq b$. We consider the case $r \neq a$ (without loss of generality, $r > a$) and $s > b$. Denote the respective weak norms of $T$ by $M$ and $N$, and
\[ s/r := \sigma \ (\ge 1), \qquad b/a := \beta \ (\ge 1). \]




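Let $a < p < r$ and $f \in L^p(\mu)$ such that $\|f\|_p > 0$. By Technique 1.61, $f_u \in L^r(\mu)$ and $f'_u \in L^a(\mu)$. Since $T$ is of weak type $(r,s)$ with weak $(r,s)$-norm $M$, we have
\[ n_u(y) \le M^s y^{-s} \|f_u\|_r^s. \tag{9} \]
Since $T$ is of weak type $(a,b)$ with weak $(a,b)$-norm $N$, we have
\[ n'_u(y) \le N^b y^{-b} \|f'_u\|_a^b. \tag{10} \]
By (8)-(10),
\[ n(y) \le (2M)^s y^{-s} \|f_u\|_r^s + (2N)^b y^{-b} \|f'_u\|_a^b. \tag{11} \]
Let $b < q < s$. By Theorem 1.59 and (11),
\[ (1/q)\|Tf\|_q^q = \int_0^\infty y^{q-1} n(y)\,dy \le (2M)^s \int_0^\infty y^{q-s-1} \|f_u\|_r^s\,dy + (2N)^b \int_0^\infty y^{q-b-1} \|f'_u\|_a^b\,dy. \tag{12} \]
Since $a < p < r$, we may apply Formulae (8) and (9) of Technique 1.61; we then conclude from (12) that
\[ (1/q)\|Tf\|_q^q \le (2M)^s r^\sigma \int_0^\infty y^{q-s-1} \Big(\int_0^u v^{r-1} m(v)\,dv\Big)^{\sigma} dy + (2N)^b a^\beta \int_0^\infty y^{q-b-1} \Big(\int_u^\infty (v-u)^{a-1} m(v)\,dv\Big)^{\beta} dy. \tag{13} \]
In this formula, we may also take $u$ dependent monotonically on $y$ (integrability is then clear). Denote the two integrals in (13) by $\Phi$ and $\Psi$. Since $\sigma, \beta \ge 1$, it follows from Corollary 5.8 and Theorem 4.6 (applied to the spaces $L^\sigma(\mathbb{R}^+, \mathcal{M}, y^{q-s-1}\,dy)$ and $L^\beta(\mathbb{R}^+, \mathcal{M}, y^{q-b-1}\,dy)$ respectively, where $\mathcal{M}$ is the Lebesgue $\sigma$-algebra over $\mathbb{R}^+$) that
\[ \Phi^{1/\sigma} = \sup \int_0^\infty y^{q-s-1} \Big(\int_0^u v^{r-1} m(v)\,dv\Big) g(y)\,dy, \tag{14} \]
where the supremum is taken over all measurable functions $g \ge 0$ on $\mathbb{R}^+$ such that $\int_0^\infty y^{q-s-1} g^{\sigma'}(y)\,dy \le 1$ ($\sigma'$ denotes here the conjugate exponent of $\sigma$). Similarly
\[ \Psi^{1/\beta} = \sup \int_0^\infty y^{q-b-1} \Big(\int_u^\infty (v-u)^{a-1} m(v)\,dv\Big) h(y)\,dy, \tag{15} \]
where the supremum is taken over all measurable functions $h \ge 0$ on $\mathbb{R}^+$ such that $\int_0^\infty y^{q-b-1} h^{\beta'}(y)\,dy \le 1$ ($\beta'$ is the conjugate exponent of $\beta$).

We now choose $u = (y/c)^k$, where $c, k$ are positive parameters to be determined later. By Tonelli's theorem, the integral in (14) is equal to
\[ \int_0^\infty v^{r-1} m(v) \Big(\int_{c v^{1/k}}^\infty y^{q-s-1} g(y)\,dy\Big) dv. \]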




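By Hölder's inequality on the measure space $(\mathbb{R}^+, \mathcal{M}, y^{q-s-1}\,dy)$, the inner integral is
\[ \le \Big(\int_{c v^{1/k}}^\infty y^{q-s-1}\,dy\Big)^{1/\sigma} \Big(\int_0^\infty y^{q-s-1} g^{\sigma'}(y)\,dy\Big)^{1/\sigma'} \le \Big(\frac{y^{q-s}}{q-s}\Big|_{c v^{1/k}}^{\infty}\Big)^{1/\sigma} = (s-q)^{-1/\sigma} c^{(q-s)/\sigma} v^{(q-s)/k\sigma}. \]
Consequently
\[ \Phi \le (s-q)^{-1} c^{q-s} \Big(\int_0^\infty v^{r-1+(q-s)/k\sigma} m(v)\,dv\Big)^{\sigma}. \tag{16} \]
Similarly, the integral in (15) is equal to
\[ \begin{aligned}
\int_0^\infty \Big(\int_0^{c v^{1/k}} y^{q-b-1} h(y)\,[v - (y/c)^k]^{a-1}\,dy\Big) m(v)\,dv
&\le \int_0^\infty v^{a-1} m(v) \Big(\int_0^{c v^{1/k}} y^{q-b-1} h(y)\,dy\Big) dv \\
&\le \int_0^\infty v^{a-1} m(v) \Big(\int_0^{c v^{1/k}} y^{q-b-1}\,dy\Big)^{1/\beta} \Big(\int_0^{c v^{1/k}} y^{q-b-1} h^{\beta'}(y)\,dy\Big)^{1/\beta'} dv \\
&\le \int_0^\infty v^{a-1} m(v) \Big(\frac{y^{q-b}}{q-b}\Big|_0^{c v^{1/k}}\Big)^{1/\beta} dv \\
&= (q-b)^{-1/\beta} c^{(q-b)/\beta} \int_0^\infty v^{a-1+(q-b)/k\beta} m(v)\,dv.
\end{aligned} \]
Therefore
\[ \Psi \le (q-b)^{-1} c^{q-b} \Big(\int_0^\infty v^{a-1+(q-b)/k\beta} m(v)\,dv\Big)^{\beta}. \tag{17} \]
Since $b < q < s$, the integrals in (16) and (17) contain the terms $v^{\kappa-1}$ and $v^{\lambda-1}$ respectively, with $\kappa := r + (q-s)/k\sigma < r$ and $\lambda := a + (q-b)/k\beta > a$. Recall that we also have $a < p < r$. If we can choose the parameter $k$ so that $\kappa = \lambda = p$, then by Corollary 1.60 both integrals will be equal to $(1/p)\|f\|_p^p$. Since $\beta = b/a$ and $\sigma = s/r$, the unique solutions for $k$ of the equations $\lambda = p$ and $\kappa = p$ are
\[ k = (a/b)\,\frac{q-b}{p-a} \quad \text{and} \quad k = (r/s)\,\frac{s-q}{r-p} \tag{18} \]
respectively, so that the choice of $k$ is possible iff the two expressions in (18) coincide. Multiplying both expressions by $p/q$, this condition on $p, q$ can be rearranged as
\[ \frac{(1/b) - (1/q)}{(1/a) - (1/p)} = \frac{(1/q) - (1/s)}{(1/p) - (1/r)}, \]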




or equivalently
\[ \frac{(1/b) - (1/q)}{(1/q) - (1/s)} = \frac{(1/a) - (1/p)}{(1/p) - (1/r)}, \tag{19} \]
that is, $1/p$ and $1/q$ divide the segments $[1/r, 1/a]$ and $[1/s, 1/b]$, respectively, according to the same (positive, finite) ratio. Equivalently,
\[ \frac{1}{p} = (1-t)\frac{1}{r} + t\,\frac{1}{a}; \qquad \frac{1}{q} = (1-t)\frac{1}{s} + t\,\frac{1}{b} \tag{20} \]
for some $t \in (0,1)$.

With the choice of $k$, it now follows from (13), (16), and (17) that
\[ (1/q)\|Tf\|_q^q \le (2M)^s (r/p)^{\sigma} (s-q)^{-1} c^{q-s} \|f\|_p^{p\sigma} + (2N)^b (a/p)^{\beta} (q-b)^{-1} c^{q-b} \|f\|_p^{p\beta}. \tag{21} \]
We now choose the parameter $c$ in the form
\[ c = (2M)^x (2N)^z \|f\|_p^w, \]
with $x, z, w$ real parameters to be determined so that the two summands on the right-hand side of (21) contain the same powers of $M$, $N$, and $\|f\|_p$, respectively. This leads to the following equations for the unknown parameters $x, z, w$:
\[ s + (q-s)x = (q-b)x; \qquad b + (q-b)z = (q-s)z; \qquad p\sigma + (q-s)w = p\beta + (q-b)w. \]
The unique solution is
\[ x = \frac{s}{s-b}; \qquad z = -\frac{b}{s-b}; \qquad w = p\,\frac{\sigma - \beta}{s-b}. \]
With the choice of $c$, the right-hand side of (21) is equal to
\[ \big[(r/p)^{\sigma}(s-q)^{-1} + (a/p)^{\beta}(q-b)^{-1}\big] (2M)^{s(q-b)/(s-b)} (2N)^{b(s-q)/(s-b)} \|f\|_p^{p\gamma}, \tag{22} \]
where
\[ \gamma = \beta + (q-b)\frac{\sigma-\beta}{s-b} = \frac{s-q}{s-b}\,\beta + \frac{q-b}{s-b}\,\sigma = t(q/a) + (1-t)(q/r) = q/p, \]
by the relations (20) and $\beta = b/a$, $\sigma = s/r$. By (20), the exponents of $2M$ and $2N$ in (22) are equal to $(1-t)q$ and $tq$, respectively. By (21) and (22), we conclude that
\[ \|Tf\|_q \le K M^{1-t} N^{t} \|f\|_p, \]
where
\[ K := 2 q^{1/q} \big[(r/p)^{s/r}(s-q)^{-1} + (a/p)^{b/a}(q-b)^{-1}\big]^{1/q} \]
does not depend on $f$. Thus, $T$ is of strong type $(p,q)$, with strong $(p,q)$-norm $\le K M^{1-t} N^{t}$. Note that the constant $K$ depends only on the parameters $a, r, b, s$ and $p, q$; it tends to $\infty$ when $q$ approaches either $b$ or $s$.




A similar argument (which we omit) yields the same conclusion in case $s < b$. The result (which we proved for $r \neq a$ and finite $b, s$) is also valid when $r = a$ and one or both exponents $b, s$ are infinite (again, we omit the details). We formalize our conclusion in the following:

Theorem 5.41 (Marcinkiewicz's interpolation theorem). Let $1 \le a \le b \le \infty$ and $1 \le r \le s \le \infty$. For $0 < t < 1$, let $p, q$ be such that
\[ \frac{1}{p} = (1-t)\frac{1}{r} + t\,\frac{1}{a} \quad \text{and} \quad \frac{1}{q} = (1-t)\frac{1}{s} + t\,\frac{1}{b}. \tag{23} \]
Suppose that the sublinear operator $T$ is of weak types $(r,s)$ and $(a,b)$, with respective weak norms $M$ and $N$. Then $T$ is of strong type $(p,q)$, with strong $(p,q)$-norm $\le K M^{1-t} N^{t}$.

The constant $K$ depends only on the parameters $a, b, r, s$, and $t$. For $a, b, r, s$ fixed, $K = K(t)$ is bounded for $t$ bounded away from the end points 0, 1.
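A short numerical instance may make the roles of the exponents and of the constant concrete. The choice of exponents below is an assumption made for illustration only; the expression for $K$ is the one obtained at the end of the derivation preceding the theorem, with $\sigma = s/r$ and $\beta = b/a$.

```latex
\begin{align*}
(r,s) &= (2,2), \qquad (a,b) = (1,1), \qquad t = \tfrac12: \\
\frac{1}{p} &= \frac12\cdot\frac12 + \frac12\cdot 1 = \frac34, \qquad
\frac{1}{q} = \frac34, \qquad p = q = \tfrac43, \\
\sigma &= s/r = 1, \qquad \beta = b/a = 1, \\
K &= 2q^{1/q}\big[(r/p)^{\sigma}(s-q)^{-1} + (a/p)^{\beta}(q-b)^{-1}\big]^{1/q} \\
  &= 2\,(\tfrac43)^{3/4}\big[\tfrac32\cdot\tfrac32 + \tfrac34\cdot 3\big]^{3/4}
   = 2\,(\tfrac43\cdot\tfrac92)^{3/4} = 2\cdot 6^{3/4} \approx 7.67.
\end{align*}
```

Under these assumptions, a sublinear operator of weak types $(2,2)$ and $(1,1)$ with weak norms $M$ and $N$ is of strong type $(4/3, 4/3)$ with norm at most $2\cdot 6^{3/4}\,M^{1/2}N^{1/2}$. Here $a < p < r$ and $b < q < s$ hold, as required in the derivation.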


Corollary 5.42. Let $a, b, r, s, t$ be as in Theorem 5.41. For any $p, q \in [1, \infty]$, denote the strong $(p,q)$-norm of the sublinear operator $T$ by $\|T\|_{p,q}$. If $T$ is of strong types $(r,s)$ and $(a,b)$, then $T$ is of strong type $(p,q)$ whenever (23) is satisfied, and
\[ \|T\|_{p,q} \le K \|T\|_{r,s}^{1-t} \|T\|_{a,b}^{t} \]
with $K$ as in Theorem 5.41.

In particular,
\[ B(L^r(X, \mathcal{A}, \mu), L^s(Y, \mathcal{B}, \nu)) \cap B(L^a(X, \mathcal{A}, \mu), L^b(Y, \mathcal{B}, \nu)) \subset B(L^p(X, \mathcal{A}, \mu), L^q(Y, \mathcal{B}, \nu)). \]


5.9 Fixed points 


Let $X$ be a vector space and $K \subset X$ a convex subset. A map $T : K \to K$ is affine if
\[ T(px + (1-p)y) = pTx + (1-p)Ty \]
for all $x, y \in K$ and all $p \in [0,1]$. For example, for any fixed $a \in X$ and $S : X \to X$ linear, the map $T : X \to X$ given by
\[ Tx = a + Sx \qquad (x \in X) \]
is affine, and if it maps $K$ into itself, then $T|_K : K \to K$ is affine as well.
If $\mathcal{T}$ is any family of maps of $K$ into itself, a fixed point for $\mathcal{T}$ is an element $x \in K$ such that $Tx = x$ for all $T \in \mathcal{T}$.
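The claim that a translate of a linear map is affine can be checked in one line; the following verification is a routine sketch added for convenience, using only the linearity of $S$ and the identity $p + (1-p) = 1$.

```latex
\begin{align*}
T(px + (1-p)y) &= a + S(px + (1-p)y) \\
               &= \big(p + (1-p)\big)a + pSx + (1-p)Sy \\
               &= p(a + Sx) + (1-p)(a + Sy) \\
               &= pTx + (1-p)Ty \qquad (x, y \in X,\ p \in [0,1]).
\end{align*}
```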


Theorem 5.43 (Markov–Kakutani's fixed point theorem). Let $X$ be a t.v.s. and let $K \subset X$ be a non-empty, compact, convex subset of $X$. Let $\mathcal{T}$ be a non-empty family of pairwise commuting, continuous, affine maps of $K$ into itself. Then $\mathcal{T}$ has a fixed point.




Proof. We combine two common tools: averaging and compactness.
For each $T \in \mathcal{T}$ and $n \in \mathbb{N}$, denote
\[ T_n := \frac{1}{n} \sum_{j=0}^{n-1} T^j : K \to K \]
(the average of the powers $T^0, T^1, \dots, T^{n-1}$), where $T^0 := I$, the identity map. This function indeed maps into $K$ because $K$ is convex. Observe:

(a) The sets $T_n K$ are compact for all $T \in \mathcal{T}$ and $n \in \mathbb{N}$, because $K$ is compact and $T_n$ is continuous.

(b) If $S, T, \dots, V \in \mathcal{T}$ and $m, n, \dots, r \in \mathbb{N}$, then $T_n \cdots V_r K \subset K$, and therefore
\[ L := S_m T_n \cdots V_r K \subset S_m K. \]
By the commutativity assumption and the same fact applied to $T_n, \dots, V_r$,
\[ L = T_n S_m \cdots V_r K \subset T_n K, \quad \dots, \quad L = V_r \cdots S_m T_n K \subset V_r K. \]
Hence
\[ \emptyset \neq L \subset S_m K \cap T_n K \cap \dots \cap V_r K. \]

We conclude from (a) and (b) that the family
\[ \mathcal{K} := \{T_n K;\ T \in \mathcal{T},\ n \in \mathbb{N}\} \]
is a non-empty family of compact subsets of $K$, which has the finite intersection property. Therefore, $\bigcap \mathcal{K} \neq \emptyset$. Let $k$ be any element in this intersection. Then $k \in K$, and we now verify that $Tk = k$ for all $T \in \mathcal{T}$, that is, $k$ is a fixed point of $\mathcal{T}$. Let $U$ be a 0-neighborhood. Since $K$ is compact and non-empty, $K - K$ is compact and contains 0. Therefore, there exists $n \in \mathbb{N}$ such that $K - K \subset nU$. Fix such an $n$, and let $T$ be any map in $\mathcal{T}$. Since $k \in T_n K$, we can write $k = T_n k_1$ for some $k_1 \in K$. By the affinity property of $T$, we have
\[ \begin{aligned}
Tk - k &= T(T_n k_1) - T_n k_1 = T\Big(\frac{1}{n}\sum_{j=0}^{n-1} T^j k_1\Big) - T_n k_1 \\
&= \frac{1}{n}\sum_{j=1}^{n} T^j k_1 - \frac{1}{n}\sum_{j=0}^{n-1} T^j k_1 = \frac{1}{n}(T^n k_1 - k_1) \in \frac{1}{n}(K - K) \subset U.
\end{aligned} \]
Since $U$ is an arbitrary 0-neighborhood, this proves that $Tk = k$ (for all $T \in \mathcal{T}$).
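The averaging mechanism in the proof can be seen explicitly in a simple case. The following sketch is an illustration under assumptions chosen here (it is not part of the text): $X = \mathbb{C}$, $K$ the closed unit disc, and a single rotation $T$ by an angle $\theta$ with $e^{i\theta} \neq 1$.

```latex
\begin{align*}
K &= \{z \in \mathbb{C};\ |z| \le 1\}, \qquad Tz := e^{i\theta} z \quad (e^{i\theta} \neq 1), \\
T_n z &= \frac{1}{n}\sum_{j=0}^{n-1} e^{ij\theta} z
       = \frac{1}{n}\,\frac{e^{in\theta} - 1}{e^{i\theta} - 1}\, z, \\
|T_n z| &\le \frac{2}{n\,|e^{i\theta} - 1|} \qquad (z \in K), \\
\bigcap_{n \in \mathbb{N}} T_n K &= \{0\}, \qquad T0 = 0.
\end{align*}
```

Each $T_n K$ is a closed disc centered at 0 whose radius tends to 0, so the intersection reduces to the single point 0, which is indeed the fixed point of the rotation.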




In the next result, Kakutani's fixed point theorem, the family $\mathcal{T}$ is a group (not necessarily commutative!) of affine maps. Another additional condition will be equicontinuity.

Definition 5.44. Let $X$ be a topological space and $Y$ be a t.v.s. Let $\mathcal{T}$ be a family of functions from $X$ to $Y$ and $x_0 \in X$. Say that $\mathcal{T}$ is equicontinuous at $x_0$ if for every 0-neighborhood $V \subset Y$ there is an $x_0$-neighborhood $U \subset X$ such that $U \subset f^{-1}(f(x_0) + V)$ for each $f \in \mathcal{T}$. Say that $\mathcal{T}$ is equicontinuous if it is equicontinuous at each point in $X$.

Equicontinuity of a family at a point evidently implies continuity of each of the individual functions at that point. The proof of the following assertions is left to the reader. Part (b) says that when $X$ is compact, equicontinuity implies the stronger "uniform equicontinuity" condition.

Lemma 5.45. Let $X$, $Y$, and $\mathcal{T}$ be as in Definition 5.44.

(a) Let $x_0 \in X$. Then $\mathcal{T}$ is equicontinuous at $x_0$ iff for every net $\{x_\alpha\}_{\alpha \in A}$ in $X$ that converges to $x_0$ we have $\lim_{\alpha \in A} f(x_\alpha) = f(x_0)$ in $Y$ uniformly in $f \in \mathcal{T}$, i.e., for every 0-neighborhood $V \subset Y$ there exists $\alpha_0 \in A$ such that for all $\alpha \ge \alpha_0$ we have $f(x_\alpha) - f(x_0) \in V$ for all $f \in \mathcal{T}$.

(b) Suppose that $X$ is compact and that $\mathcal{T}$ is equicontinuous:

(i) If $X$ is a metric space, then for every 0-neighborhood $V \subset Y$ there is $\delta > 0$ such that for every $x, y \in X$, if $d_X(x,y) < \delta$ then $f(x) - f(y) \in V$ for all $f \in \mathcal{T}$. Equivalently, for all nets $\{x_\alpha\}_{\alpha \in A}$ and $\{y_\alpha\}_{\alpha \in A}$ in $X$ such that $\lim_{\alpha \in A} d_X(x_\alpha, y_\alpha) = 0$ we have $\lim_{\alpha \in A}(f(x_\alpha) - f(y_\alpha)) = 0$ uniformly in $f \in \mathcal{T}$.

(ii) If $X$ is a topological subspace of a t.v.s. $X_1$, then for every 0-neighborhood $V \subset Y$ there is a 0-neighborhood $U \subset X_1$ such that for every $x, y \in X$, if $x - y \in U$ then $f(x) - f(y) \in V$ for all $f \in \mathcal{T}$. Equivalently, for all nets $\{x_\alpha\}_{\alpha \in A}$ and $\{y_\alpha\}_{\alpha \in A}$ in $X$ such that $\lim_{\alpha \in A}(x_\alpha - y_\alpha) = 0$ we have $\lim_{\alpha \in A}(f(x_\alpha) - f(y_\alpha)) = 0$ uniformly in $f \in \mathcal{T}$.

Theorem 5.46 (Kakutani's fixed point theorem). Let $X$ be a locally convex t.v.s. and let $K \subset X$ be a non-empty, compact, convex set. Let $\mathcal{T}$ be an equicontinuous group of bijective affine maps of $K$ onto itself. Then $\mathcal{T}$ has a fixed point.

We derive the theorem from a more general one. A family of maps $\mathcal{T}$ from a topological space $K$ into itself is called distal if whenever $x, y \in K$ and $\{T_\alpha\}_\alpha$ is a net in $\mathcal{T}$ such that $\lim_\alpha T_\alpha x$ and $\lim_\alpha T_\alpha y$ exist and are equal, we have $x = y$. For example, every family of linear isometries on a normed space is distal.
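The example of linear isometries can be verified directly. The following lines are a brief sketch of that verification (the common limit is called $z$ here, a name introduced only for this illustration).

```latex
\begin{align*}
&\text{Assume } T_\alpha x \to z \ \text{and } T_\alpha y \to z \ \text{in norm, with each } T_\alpha \ \text{a linear isometry. Then} \\
&\|x - y\| = \|T_\alpha(x - y)\| = \|T_\alpha x - T_\alpha y\|
   \le \|T_\alpha x - z\| + \|z - T_\alpha y\| \longrightarrow 0, \\
&\text{hence } \|x - y\| = 0, \ \text{that is, } x = y.
\end{align*}
```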


Theorem 5.47 (F. Hahn's fixed point theorem). Let $X$ be a locally convex t.v.s., and let $K \subset X$ be a non-empty, compact, convex subset of $X$. Let $\mathcal{T}$ be a non-empty distal semigroup of continuous affine maps of $K$ into itself. Then $\mathcal{T}$ has a fixed point.




Proof of F. Hahn's fixed point theorem. Let $\Omega$ be the family of all non-empty, compact, convex $\mathcal{T}$-invariant subsets of $K$, ordered by inclusion. Since $K \in \Omega$, $\Omega$ is non-empty. If $\Omega_0 \subset \Omega$ is totally ordered, every finite intersection of sets of the family $\Omega_0$ is one of the sets of the family, hence non-empty. Thus, $\Omega_0$ has the finite intersection property, and since the sets of the family are non-empty compact sets, the intersection $K_0 := \bigcap \Omega_0$ is a non-empty compact set. Clearly, $K_0 \subset K$ is convex and $\mathcal{T}$-invariant, that is, $K_0 \in \Omega$. By definition, $K_0$ is a lower bound for $\Omega_0$. By Zorn's lemma, $\Omega$ contains a minimal element $H$. It suffices to prove that $H$ is a singleton.

Let $z \in H$. Then $H' := \overline{\mathrm{co}}(\mathcal{T}z) = \overline{\mathrm{co}(\mathcal{T}z)}$ is a non-empty, compact, convex subset of $H$, as $H$ is compact, convex, and $\mathcal{T}$-invariant. It is also $\mathcal{T}$-invariant because $\mathcal{T}$ is a semigroup of affine continuous maps. Indeed, if $u \in \mathrm{co}(\mathcal{T}z)$, there exist $T_i \in \mathcal{T}$ and $t_i \in [0,1]$ ($i = 1, \dots, n$) such that $\sum_i t_i = 1$ and $u = \sum_{i=1}^n t_i T_i z$. Hence, for all $T \in \mathcal{T}$, $Tu = \sum_{i=1}^n t_i T T_i z$ by affinity, and this vector belongs to $\mathrm{co}(\mathcal{T}z)$ because $TT_i \in \mathcal{T}$ for all $i$. This proves that $\mathrm{co}(\mathcal{T}z)$ is $\mathcal{T}$-invariant, and consequently so is $H' = \overline{\mathrm{co}}(\mathcal{T}z)$ by continuity. Thus, $H' \in \Omega$, and the minimality of $H$ entails that
\[ H' = H, \quad \text{that is,} \quad \overline{\mathrm{co}}(\mathcal{T}z) = H. \tag{1} \]
Assume that $x, y \in H$. Set $z := \frac{1}{2}(x + y)$. Then $z \in H$ by convexity, so we may apply the previous paragraph to this $z$. By the Krein–Milman theorem (5.30), $H$ has an extremal point $h$. By (1) and Milman's theorem (Exercise 11), we must have $h \in \overline{\mathcal{T}z}$. Therefore, there is a net $\{T_\alpha\}_\alpha$ in $\mathcal{T}$ such that $h = \lim_\alpha T_\alpha z$. As $H$ is compact we may assume, by passing to a subnet if necessary, that the limits $u := \lim_\alpha T_\alpha x$ and $v := \lim_\alpha T_\alpha y$ also exist (in $H$). Hence, the definition of $z$ and the affinity of the maps $T_\alpha$ imply that $h = \frac{1}{2}(u + v)$. But $h$ is an extremal point of $H$, so $h$ equals both $u = \lim_\alpha T_\alpha x$ and $v = \lim_\alpha T_\alpha y$. Since $\mathcal{T}$ is distal, we infer that $x = y$.

Proof of Kakutani's fixed point theorem. To apply F. Hahn's theorem, all we need to do is to prove that the family of maps $\mathcal{T}$ is distal.

Suppose that $x, y \in K$ and $\{T_\alpha\}_{\alpha \in A}$ is a net in $\mathcal{T}$ such that $\lim_\alpha T_\alpha x$ and $\lim_\alpha T_\alpha y$ exist and are equal. Let $V$ be a 0-neighborhood of $X$. Since $\mathcal{T}$ is equicontinuous and $K$ is compact, there exists a 0-neighborhood $U$ of $X$ such that for every $a, b \in K$ and $T \in \mathcal{T}$, if $a - b \in U$ then $Ta - Tb \in V$. Since $\lim_\alpha (T_\alpha x - T_\alpha y) = 0$, there is $\alpha_0 \in A$ such that for all $\alpha \ge \alpha_0$ we have $T_\alpha x - T_\alpha y \in U$, thus $T T_\alpha x - T T_\alpha y \in V$ when $T \in \mathcal{T}$. Taking $\alpha := \alpha_0$ and $T := T_{\alpha_0}^{-1}$ gives $x - y \in V$. Since $V$ was an arbitrary 0-neighborhood, we conclude that $x = y$.


As an application of Kakutani’s theorem, we shall give a proof of the existence 
of the Haar invariant measure for compact groups. While the proof is not 
“constructive” as the one given in Section 4.5, and applies only to compact 
groups, it is less technical and may be more elegant. 

Preliminaries. Let $G$ be a compact topological group, and let $C(G)$ be the Banach space of all complex-valued continuous functions on $G$. For each $s \in G$, the left (right) translation operator $L_s$ ($R_s$) on $C(G)$ is defined by
\[ (L_s f)(t) = f(st) \qquad ((R_s f)(t) = f(ts), \text{ resp.}) \qquad (t \in G). \]
The dual Banach space $C(G)^*$ may be identified with the space of all complex regular Borel measures on $G$ with the total variation norm, through the Riesz representation theorem. A common notation is
\[ \langle f, \mu \rangle := \mu(f) \qquad (f \in C(G),\ \mu \in C(G)^*). \]
For positive measures $\mu$ on $G$, $\|\mu\| = \mu(G)$; if $\mu(G) = 1$, $\mu$ is called a probability measure. Thus, the set $P$ of regular Borel probability measures on $G$ is a subset of the closed unit ball of $C(G)^*$. The latter is compact in the weak* topology by Alaoglu's theorem (Theorem 5.24). Since $P$ is closed in the weak* topology, it follows that $P$ is a compact subset of $C(G)^*$ (endowed with the weak* topology). The (Dirac) point measure at the identity of $G$ is clearly in $P$, so that $P \neq \emptyset$. The set $P$ is clearly convex. The family $\{L_s;\ s \in G\}$ is a group (under composition) of linear operators on $C(G)$, since for all $s, t \in G$,
\[ L_s L_t = L_{ts}; \qquad L_s L_{s^{-1}} = L_e = I, \]
where $e$ is the identity in $G$ and $I$ is the identity operator on $C(G)$. Consequently, its family of Banach adjoints (see the proof of Theorem 5.10)
\[ \{L_s^*;\ s \in G\}, \]
consisting of the linear operators on $C(G)^*$ given by
\[ \langle f, L_s^* \mu \rangle = \langle L_s f, \mu \rangle \qquad (f \in C(G),\ \mu \in C(G)^*,\ s \in G), \]
is a group (under composition), since for all $s, t \in G$,
\[ L_s^* L_t^* = (L_t L_s)^* = L_{st}^*; \qquad L_s^* L_{s^{-1}}^* = L_e^* = I, \]
where now $I$ is the identity operator on $C(G)^*$. Notice also that both the closed unit ball of $C(G)^*$ and $P$ are $\{L_s^*;\ s \in G\}$-invariant.

We shall need the following well-known theorem. As preparation, suppose that $Z$ is a topological space and $A \subset Z$. Recall that $A$ is called conditionally (or relatively) compact if its closure $\bar{A}$ is compact. Also recall that if $Z$ is a metric space, $A$ is called totally bounded if for every $\epsilon > 0$ there exist finitely many elements $a_1, \dots, a_n$ in $A$ such that the $\epsilon$-balls centered at $a_i$, $i = 1, \dots, n$, cover $A$:
\[ A \subset \bigcup_{i=1}^{n} B(a_i, \epsilon). \]
Finally, if $Z$ is a complete metric space, then $A$ is conditionally compact iff it is totally bounded.


Theorem 5.48 (Arzelà–Ascoli). Let $X$ be a compact topological space, and let $\Phi \subset C(X)$. The following conditions are equivalent:

(i) $\Phi$ is pointwise bounded and equicontinuous;
(ii) $\Phi$ is bounded and equicontinuous;
(iii) $\Phi$ is conditionally compact (equivalently: totally bounded).

The "pointwise boundedness" condition means that for each $x \in X$, the set $\{f(x);\ f \in \Phi\}$ is bounded.
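A standard example shows that boundedness alone does not give conditional compactness. The sketch below is an illustration chosen here (not from the text): the family of monomials on $[0,1]$ is bounded in $C([0,1])$ but fails to be equicontinuous at the point 1.

```latex
\begin{align*}
X &= [0,1], \qquad \Phi := \{f_n;\ n \in \mathbb{N}\}, \qquad f_n(x) := x^n, \qquad \|f_n\| = 1, \\
f_n(1) - f_n\big(1 - \tfrac{1}{n}\big) &= 1 - \big(1 - \tfrac{1}{n}\big)^n \longrightarrow 1 - e^{-1} > \tfrac12
   \qquad (n \to \infty).
\end{align*}
```

Every neighborhood of 1 contains the points $1 - 1/n$ for large $n$, so no single neighborhood $U$ of 1 satisfies $|f_n(x) - f_n(1)| < 1/2$ for all $x \in U$ and all $n$. Thus (ii) fails, and by the theorem $\Phi$ is not conditionally compact.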


Proof. We will show that (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii), and leave the implication (iii) $\Rightarrow$ (i) to the reader. For an alternative, "sequential", proof, see Exercise 3 in Chapter 9.

Assume that $\Phi$ is pointwise bounded and equicontinuous. Let $\epsilon > 0$. For each $x \in X$, the equicontinuity of $\Phi$ at $x$ provides an open neighborhood $V_x$ of $x$ in $X$ such that
\[ \sup_{f \in \Phi} |f(y) - f(x)| < \frac{\epsilon}{3} \qquad (\forall y \in V_x). \tag{2} \]
The family $\{V_x;\ x \in X\}$ is an open cover of $X$. By compactness of $X$, there exist $x_i \in X$, $i = 1, \dots, n$, such that $X = \bigcup_{i=1}^n V_{x_i}$. By the pointwise boundedness of $\Phi$,
\[ M := \max_{1 \le i \le n} \sup_{f \in \Phi} |f(x_i)| + \frac{\epsilon}{3} < \infty. \tag{3} \]
Let $y \in X$. There exists $i$ ($1 \le i \le n$) such that $y \in V_{x_i}$. Therefore, for all $f \in \Phi$, we have by (2) and (3)
\[ |f(y)| \le |f(x_i)| + |f(y) - f(x_i)| \le M. \]
Thus, $\Phi$ is a subset of the closed $M$-ball of $C(X)$, and in particular it is bounded.

The closed disc $D := \{\lambda \in \mathbb{C};\ |\lambda| \le M\}$ is compact in $\mathbb{C}$; therefore, $D^n$ is compact in $\mathbb{C}^n$. Define
\[ \pi : \Phi \to D^n \]
by
\[ \pi(f) = (f(x_1), \dots, f(x_n)) \qquad (f \in \Phi). \]
As a subset of the compact set $D^n$, $\pi(\Phi)$ is totally bounded. Therefore, there exist $f_1, \dots, f_m \in \Phi$ such that the $\epsilon/3$-balls in $\mathbb{C}^n$ centered at $\pi(f_k)$, $k = 1, \dots, m$, cover $\pi(\Phi)$. Thus, for $f \in \Phi$, there exists $k$, $1 \le k \le m$, such that
\[ |f(x_i) - f_k(x_i)| < \frac{\epsilon}{3} \qquad (\forall i = 1, \dots, n). \tag{4} \]
Let $y \in X$. Let $i$ ($1 \le i \le n$) be such that $y \in V_{x_i}$. For this index $i$, we have
\[ |f(y) - f_k(y)| \le |f(y) - f(x_i)| + |f(x_i) - f_k(x_i)| + |f_k(x_i) - f_k(y)| < \epsilon, \]
by (2) and (4). This shows that every $f \in \Phi$ satisfies $\|f - f_k\| < \epsilon$ for some $k$, that is, the $\epsilon$-balls centered at $f_k$, $k = 1, \dots, m$, cover $\Phi$.


Theorem 5.49 (Existence and uniqueness of the Haar measure). Let $G$ be a compact topological group. Then there exists a unique regular Borel probability measure $\mu$ on $G$ that is left invariant, that is,
\[ \int_G L_s f\,d\mu = \int_G f\,d\mu \qquad (\forall f \in C(G),\ s \in G). \]
This measure $\mu$ is also right invariant, that is,
\[ \int_G R_s f\,d\mu = \int_G f\,d\mu \qquad (\forall f \in C(G),\ s \in G), \]
and satisfies the identity
\[ \int_G f(s^{-1})\,d\mu(s) = \int_G f(s)\,d\mu(s) \qquad (\forall f \in C(G)). \]
G G 


Proof. In applying Kakutani's theorem, we take $X$ to be $C(G)^*$ with the weak* topology. Consider the group $\{L_s^*;\ s \in G\}$ defined in the preliminaries here. We verify that its restriction to the closed unit ball $C(G)^*_1$ of the Banach space $C(G)^*$ is equicontinuous (with respect to the weak* topology). Let $V$ be the 0-neighborhood in $C(G)^*$ (with the weak* topology) determined by $\epsilon > 0$ and $f_1, \dots, f_n \in C(G)$. We must find a 0-neighborhood $U$ in $C(G)^*$ (with the weak* topology), such that
\[ L_s^*(q - p) \in V \qquad (\forall s \in G,\ \forall q, p \in C(G)^*_1,\ q - p \in U). \tag{5} \]
The finite set $\{f_1, \dots, f_n\}$ defining $V$ is equicontinuous on the compact set $G$. Thus, given $\delta > 0$, there exists a neighborhood $N$ of the identity in $G$ such that
\[ |f_j(t) - f_j(u)| < \delta \qquad (\forall j = 1, \dots, n) \]
whenever $t, u \in G$ are such that $u^{-1}t \in N$. For such $t, u$ and all $s \in G$, $(su)^{-1}(st) = u^{-1}t \in N$, and therefore
\[ |(L_s f_j)(u) - (L_s f_j)(t)| < \delta \qquad (\forall s \in G,\ j = 1, \dots, n). \]
The family $\Phi := \{L_s f_j;\ s \in G,\ j = 1, \dots, n\}$ is therefore equicontinuous on $G$, and it is evidently bounded. By the Arzelà–Ascoli theorem it is consequently totally bounded. Let then $g_1, \dots, g_m \in C(G)$ be such that the $\epsilon/4$-balls centered at these $g_i$ cover $\Phi$. Define $U$ as the 0-neighborhood in $C(G)^*$ (with the weak* topology) determined by $\epsilon/2$ and $g_1, \dots, g_m$. Let $q, p \in C(G)^*_1$. If $q - p \in U$, $s \in G$, and $1 \le j \le n$, let $i$ ($1 \le i \le m$) be such that the $\Phi$-element $L_s f_j$ belongs to the $\epsilon/4$-ball in $C(G)$ centered at $g_i$. Then (with norms understood by their context):
\[ \begin{aligned}
|\langle f_j, L_s^*(q - p) \rangle| = |\langle L_s f_j, q - p \rangle| &\le |\langle L_s f_j - g_i, q - p \rangle| + |\langle g_i, q - p \rangle| \\
&< \|L_s f_j - g_i\|\,\|q - p\| + \frac{\epsilon}{2} \le 2\cdot\frac{\epsilon}{4} + \frac{\epsilon}{2} = \epsilon.
\end{aligned} \]
In conclusion, $\{L_s^*|_{C(G)^*_1};\ s \in G\}$ is equicontinuous, and therefore, so is $\mathcal{T} := \{L_s^*|_P;\ s \in G\}$. We have verified that the hypothesis of Kakutani's theorem is satisfied by the group $\mathcal{T}$ (under composition) of affine maps, and the non-empty compact convex set $P$ in $C(G)^*$ (with the weak* topology). By the theorem, $\mathcal{T}$ has a fixed point $\mu \in P$. This means that for all $s \in G$ and $f \in C(G)$,
\[ \langle L_s f, \mu \rangle = \langle f, L_s^* \mu \rangle = \langle f, \mu \rangle, \]
that is,
\[ \int_G L_s f\,d\mu = \int_G f\,d\mu \qquad (\forall f \in C(G),\ s \in G). \]
This proves the existence of a left-invariant regular Borel probability measure on $G$. A similar proof shows that there exists a "right-invariant" regular Borel probability measure $\mu_r$ on $G$, that is,
\[ \int_G (R_t f)\,d\mu_r = \int_G f\,d\mu_r \qquad (\forall f \in C(G),\ t \in G). \]
Let $\mu_l$ and $\mu_r$ be any regular Borel probability measures on $G$, which are left and right invariant, respectively. By Fubini's theorem and the fact that $\mu_l, \mu_r$ are probability measures, we have for all $f \in C(G)$,
\[ \begin{aligned}
\int_G f(t)\,d\mu_l(t) &= \int_G \Big(\int_G f(st)\,d\mu_l(t)\Big) d\mu_r(s) \\
&= \int_{G \times G} f(st)\,d(\mu_r \times \mu_l)(s,t) = \int_G \Big(\int_G f(st)\,d\mu_r(s)\Big) d\mu_l(t) \\
&= \int_G f(s)\,d\mu_r(s).
\end{aligned} \]
The uniqueness of the Riesz representation implies that $\mu_l = \mu_r$. This proves the uniqueness of the left-invariant (probability) measure on $G$, and the fact that it is also right invariant. Denote it by $\mu$.

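Consider next the linear map $T$ defined on $C(G)$ by
\[ (Tf)(s) := f(s^{-1}) \qquad (f \in C(G),\ s \in G). \]
$T$ is an isometric isomorphism of $C(G)$ onto itself. Let $\tilde{\mu} := T^*\mu = \mu \circ T$ ($\in C(G)^*$). If $f \in C(G)$ is non-negative, $\langle f, \tilde{\mu} \rangle = \langle Tf, \mu \rangle \ge 0$, because $Tf \ge 0$ and $\mu$ is a non-negative measure. Therefore $\tilde{\mu}$ is a positive measure. Also
\[ \tilde{\mu}(G) = \langle 1, \tilde{\mu} \rangle = \langle T1, \mu \rangle = \langle 1, \mu \rangle = 1, \]
where we have denoted by 1 the constant function with value 1 on $G$. Thus $\tilde{\mu} \in P$. Let $s \in G$. For all $f \in C(G)$ and $t \in G$,
\[ (T L_s f)(t) = (L_s f)(t^{-1}) = f(st^{-1}) = f((ts^{-1})^{-1}) = (Tf)(ts^{-1}) = (R_{s^{-1}} T f)(t), \]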




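proving that $T L_s = R_{s^{-1}} T$. Hence, by the right invariance of $\mu$,
\[ \tilde{\mu} \circ L_s = \mu \circ T \circ L_s = \mu \circ R_{s^{-1}} \circ T = \mu \circ T = \tilde{\mu} \]
for all $s \in G$, that is, $\tilde{\mu}$ is left invariant. By the uniqueness of the left-invariant (regular Borel) probability measure, we conclude that $\tilde{\mu} = \mu$, that is, for all $f \in C(G)$,
\[ \int_G f(s^{-1})\,d\mu(s) = \langle Tf, \mu \rangle = \langle f, \tilde{\mu} \rangle = \langle f, \mu \rangle = \int_G f(s)\,d\mu(s). \]
This completes the proof.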
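For orientation, the simplest instance of the theorem is a finite group with the discrete topology, where the Haar measure is normalized counting measure. The sketch below is an illustration added here under that assumption; it checks left invariance and the inversion identity by a change of summation variable.

```latex
\begin{align*}
\mu(E) &:= |E|/|G| \qquad (E \subset G), \\
\int_G L_s f\,d\mu &= \frac{1}{|G|}\sum_{t \in G} f(st)
   = \frac{1}{|G|}\sum_{u \in G} f(u) = \int_G f\,d\mu
   \qquad (u = st), \\
\int_G f(s^{-1})\,d\mu(s) &= \frac{1}{|G|}\sum_{s \in G} f(s^{-1})
   = \frac{1}{|G|}\sum_{u \in G} f(u) = \int_G f\,d\mu
   \qquad (u = s^{-1}).
\end{align*}
```

Both steps use only that $t \mapsto st$ and $s \mapsto s^{-1}$ are bijections of $G$; right invariance follows in the same way from the bijection $t \mapsto ts$.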


5.10 The bounded weak*-topology 


Let $X$ be a normed space. A topology stronger than the weak*-topology on $X^*$ is the so-called "bounded weak*-topology" (or bweak*-topology for short). It is defined by letting $F \subset X^*$ be bweak*-closed iff $F \cap aS^*$ is weak*-closed for all $a > 0$ (where $S^*$ denotes the closed unit ball of $X^*$). Note that since $aS^*$ itself is weak*-closed (it is even weak*-compact, by Alaoglu's theorem!), every weak*-closed set $F$ is bweak*-closed. Thus, the bweak*-topology is indeed stronger than the weak*-topology (both on $X^*$).

Given $x \in X$, set
\[ x^{\sim} := \{x^* \in X^*;\ |x^* x| < 1\}. \]
If $A \subset X$, set $A^{\sim} := \bigcap_{x \in A} x^{\sim}$. Clearly, $x^{\sim}$ is a weak*-open subset of $X^*$, so that $A^{\sim}$ is weak*-open for finite subsets $A$ of $X$, and we have:

Observation 1. The family
\[ \{A^{\sim};\ A \subset X \text{ finite}\} \]
is a base for the weak* 0-neighborhoods in $X^*$.

A set $U \subset X^*$ is bweak*-open iff $U^c$ is bweak*-closed, that is, iff $U^c \cap aS^*$ is weak*-closed for all $a > 0$, that is, iff $(U^c \cap aS^*)^c = U \cup (aS^*)^c$ is weak*-open for all $a > 0$. This implies that $(U \cup (aS^*)^c) \cap aS^*$ ($= U \cap aS^*$) is relatively weak*-open in $aS^*$ for all $a > 0$. Conversely, if $U \cap aS^*$ is relatively weak*-open in $aS^*$ for all $a > 0$, then for each $a > 0$, there exists a weak*-open set $V_a$ in $X^*$ such that $U \cap aS^* = V_a \cap aS^*$. Then for all $a > 0$, $U \cup (aS^*)^c = (V_a \cap aS^*) \cup (aS^*)^c = V_a \cup (aS^*)^c$ is weak*-open, because $aS^*$ is weak*-closed. Equivalently, $U$ is bweak*-open, as explained earlier. In conclusion:

Observation 2. A subset $U \subset X^*$ is bweak*-open iff $U \cap aS^*$ is relatively weak*-open in $aS^*$ for all $a > 0$.

Given $A \subset X$ and $a > 0$, set
\[ A_a := \{x \in A;\ \|x\| \ge 1/a\}. \]




Trivially, $A_a \subset A$, so $A^{\sim} \subset A_a^{\sim}$. Hence
\[ A^{\sim} \cap aS^* \subset A_a^{\sim} \cap aS^*. \]
On the other hand, if $0 \neq x^* \in A_a^{\sim} \cap aS^*$ and $x \in A \setminus A_a$ (i.e., $\|x\| < 1/a$), then $|x^* x| \le \|x^*\|\,\|x\| < \|x^*\|/a \le 1$. This proves that $x^* \in A^{\sim}$. Hence:

Observation 3. For all $A \subset X$ and $a > 0$,
\[ A^{\sim} \cap aS^* = A_a^{\sim} \cap aS^*. \]
In the following, a subset $A \subset X$ is called a norm-null sequence if it is the set of elements of a norm-null sequence in $X$.

Theorem 5.50 (Dieudonné's theorem). Let $X$ be a normed space. The family
\[ \{A^{\sim};\ A \subset X \text{ is a norm-null sequence}\} \]
is an open base for the bweak* 0-neighborhoods in $X^*$.


Proof. Let $A$ be a norm-null sequence in $X$. Then $A_a$ is finite for each $a > 0$. Therefore, $A_a^{\sim}$ is weak*-open, so that $A_a^{\sim} \cap aS^*$ is relatively weak*-open in $aS^*$. By Observations 2 and 3, $A^{\sim}$ is bweak*-open.

Conversely, let $U$ be a bweak*-open 0-neighborhood in $X^*$. We proceed to define recursively a sequence of finite sets $A_n \subset X$ with the following properties:

(i) $A_1 \subset A_2 \subset A_3 \subset \cdots$;
(ii) if $x \in A_{n+1} \setminus A_n$, then $\|x\| < 1/n$;
(iii) $A_n^{\#} \cap nS^* \subset U$, where
\[ x^{\#} := \{x^* \in X^*;\ |x^* x| \le 1\} \qquad (x \in X) \]
and $A^{\#} := \bigcap_{x \in A} x^{\#}$ for any $A \subset X$.

Note that $x^{\#}$ is weak*-closed for each $x \in X$, and therefore $A^{\#}$ is weak*-closed for every $A \subset X$.

By Observation 2, $U \cap S^*$ is relatively weak*-open in $S^*$. Since, by Observation 1, $\{A^{\sim} \cap S^*;\ A \subset X \text{ finite}\}$ is a base for the relative weak*-open 0-neighborhoods in $S^*$, there exists a finite set $B \subset X$ such that $B^{\sim} \cap S^* \subset U \cap S^*$. Let $A_1 := 2B$. If now $x^* \in A_1^{\#} \cap S^*$, then for all $x \in B$, we have $2x \in A_1$, hence $|x^* x| = \frac{1}{2}|x^*(2x)| \le \frac{1}{2} < 1$, that is, $x^* \in B^{\sim} \cap S^* \subset U$. Thus, $A_1^{\#} \cap S^* \subset U$, so $A_1$ has property (iii).

Suppose that $A_1, A_2, \dots, A_n$ have been constructed with properties (i)-(iii) up to index $n$. Assume that for all finite sets $B \subset X$ with $\|B\| := \sup_{x \in B} \|x\| < 1/n$ we have
\[ (A_n \cup B)^{\#} \cap (n+1)S^* \cap U^c \neq \emptyset. \tag{1} \]
Since $U^c$ is bweak*-closed, $(n+1)S^* \cap U^c$ is weak*-closed by the definition of the bweak*-topology. Also, $(A_n \cup B)^{\#}$ is weak*-closed (as observed previously), hence the sets in (1) are weak*-closed subsets of the weak*-compact set $(n+1)S^*$ (by Alaoglu's theorem). Consequently, the sets in (1) are weak*-compact. They have the finite intersection property, because if $B_1, \dots, B_r$ are finite subsets of $X$ with $\|B_i\| < 1/n$ for each $1 \le i \le r$, then $B := \bigcup_{i=1}^r B_i$ is a finite subset of $X$ with $\|B\| < 1/n$, and
\[ \bigcap_{i=1}^{r} (A_n \cup B_i)^{\#} \cap (n+1)S^* \cap U^c = (A_n \cup B)^{\#} \cap (n+1)S^* \cap U^c \neq \emptyset \]
by the hypothesis (1). It follows that the intersection of all sets in (1) is non-empty. In particular
\[ \bigcap_{x \in X,\ \|x\| < 1/n} (A_n \cup \{x\})^{\#} \cap (n+1)S^* \cap U^c \neq \emptyset. \]
Since $(A_n \cup \{x\})^{\#} = A_n^{\#} \cap x^{\#}$, it follows that there exists $x^* \in A_n^{\#} \cap (n+1)S^* \cap U^c$ such that $|x^* x| \le 1$ for all $x \in X$ with $\|x\| < 1/n$. If $x \in X$ and $\|x\| < 1$, then $\|\frac{1}{n}x\| < \frac{1}{n}$, so that $\frac{1}{n}|x^* x| = |x^*(\frac{1}{n}x)| \le 1$, proving that $\|x^*\| \le n$. Therefore, $x^* \in A_n^{\#} \cap nS^* \cap U^c$, contradicting (iii) in the recursion hypothesis. This contradiction shows that our assumption that (1) holds for each finite $B \subset X$ with $\|B\| < 1/n$ is false. That is, there exists a finite set $B_n \subset X$ such that $\|B_n\| < 1/n$ and $(A_n \cup B_n)^{\#} \cap (n+1)S^* \subset U$. Define $A_{n+1} := A_n \cup B_n$. Evidently, properties (i)-(iii) are satisfied up to index $n+1$, and the recursive construction is complete.

Define now $A := \bigcup_{n=1}^{\infty} A_n$. Since each $A_n$ is finite, (i) and (ii) show that $A$ is a norm-null sequence.

Finally, if $x^* \in A^{\sim}$, then $x^* \in A_n^{\#}$ for all $n \in \mathbb{N}$. Let $n > \|x^*\|$. Then
\[ x^* \in A^{\sim} \cap nS^* \subset A_n^{\#} \cap nS^* \subset U \]
by (iii). Hence $A^{\sim} \subset U$, as desired.


The meaning of Dieudonné's theorem is that a net $\{x^*_\alpha\}_{\alpha \in I}$ in $X^*$ converges in the bweak*-topology to $x^* \in X^*$ iff for every norm-null sequence $A \subset X$, $\{x^*_\alpha\}_{\alpha \in I}$ converges to $x^*$ uniformly on $A$, that is, $\lim_{\alpha \in I} \big(\sup_{x \in A} |(x^*_\alpha - x^*)(x)|\big) = 0$.

It follows from Dieudonné's theorem that $X^*$ with the bweak*-topology is a topological vector space, which is locally convex as $A^{\sim}$ is convex for every $A \subset X$.


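Theorem 5.51 (Krein–Smulian's theorem). Let $X$ be a Banach space. A linear functional on $X^*$ is bweak*-continuous iff it is weak*-continuous.

Proof. Let $f : X^* \to \mathbb{C}$ be linear. Since the bweak*-topology is stronger than the weak*-topology, weak*-continuity of $f$ trivially implies bweak*-continuity. Conversely, suppose that $f$ is bweak*-continuous. Then
\[ U := f^{-1}(\{\lambda \in \mathbb{C};\ |\lambda| < 1\}) \]
is a bweak*-open neighborhood of 0 in $X^*$, and contains therefore a basic neighborhood as described in Dieudonné's theorem. That is, there exists a null sequence $\{x_i;\ i \in \mathbb{N}\}$ in $X$ such that $\{x_i;\ i \in \mathbb{N}\}^{\sim} \subset U$. Hence
\[ |f(x^*)| < 1 \qquad (\forall x^* \in \{x_i;\ i \in \mathbb{N}\}^{\sim}). \tag{2} \]
Let $\{x_i;\ i \in \mathbb{N}\}^{\perp} := \{x^* \in X^*;\ x^* x_i = 0 \text{ for all } i \in \mathbb{N}\}$. Since $\{x_i;\ i \in \mathbb{N}\}^{\perp} \subset \{x_i;\ i \in \mathbb{N}\}^{\sim}$, we have by (2)
\[ |f(x^*)| < 1 \qquad (\forall x^* \in \{x_i;\ i \in \mathbb{N}\}^{\perp}). \]
Let $x^* \in \{x_i;\ i \in \mathbb{N}\}^{\perp}$. For each $n \in \mathbb{N}$ we have $nx^* \in \{x_i;\ i \in \mathbb{N}\}^{\perp}$, and therefore $n|f(x^*)| = |f(nx^*)| < 1$. Hence $f(x^*) = 0$. This proves that
\[ f|_{\{x_i;\ i \in \mathbb{N}\}^{\perp}} = 0. \tag{3} \]
Define
\[ \tau : X^* \to c_0 \]
by
\[ \tau x^* = \{x^* x_i;\ i \in \mathbb{N}\} \qquad (x^* \in X^*). \]
If $x^*, y^* \in X^*$ are such that $x^* x_i = y^* x_i$ for all $i \in \mathbb{N}$, then $x^* - y^* \in \{x_i;\ i \in \mathbb{N}\}^{\perp}$, and therefore $f(x^* - y^*) = 0$ by (3). Thus, if $\tau x^* = \tau y^*$, also $f(x^*) = f(y^*)$. It follows that the map
\[ \tilde{f} : \tau X^* \to \mathbb{C} \]
defined by
\[ \tilde{f}(\tau x^*) = f(x^*) \]
is well defined. Its linearity is clear. If $\|\tau x^*\|_\infty < 1$, then $|x^* x_i| < 1$ for all $i \in \mathbb{N}$, hence $x^* \in \{x_i;\ i \in \mathbb{N}\}^{\sim}$, and therefore $|f(x^*)| < 1$ by (2), that is, $|\tilde{f}(\tau x^*)| < 1$. This shows that $\|\tilde{f}\| \le 1$. By the Hahn–Banach theorem, there exists $g \in c_0^*$ such that $g|_{\tau X^*} = \tilde{f}$ and $\|g\| = \|\tilde{f}\| \le 1$. By the "description" of $c_0^*$ as $l^1$ (see Exercise 8 of Chapter 4), there exists a sequence $\{g_i;\ i \in \mathbb{N}\} \in l^1$ such that $g(\xi) = \sum_{i=1}^\infty g_i \xi_i$ for all $\xi \in c_0$ and $\sum_{i=1}^\infty |g_i| = \|\{g_i;\ i \in \mathbb{N}\}\|_1 = \|g\|_{c_0^*} \le 1$. Since $\{g_i;\ i \in \mathbb{N}\} \in l^1$ and $\{x_i;\ i \in \mathbb{N}\}$ is a norm-null sequence, the series $\sum_{i=1}^\infty g_i x_i$ converges (absolutely) in the Banach space $X$. Therefore, for all $x^* \in X^*$,
\[ x^*\Big(\sum_i g_i x_i\Big) = \sum_i g_i\, x^* x_i = g(\tau x^*) = \tilde{f}(\tau x^*) = f(x^*). \]
Letting $x := \sum_i g_i x_i$, we infer that $f(x^*) = x^* x$ for all $x^* \in X^*$, that is, $f = \kappa x$, where $\kappa$ is the canonical embedding of $X$ in $X^{**}$. This proves that $f$ is weak*-continuous.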


We end the discussion with two equivalent restatements of the Krein-Smulian 
theorem. The first follows from the definition of the bweak*-topology. The second 
is obtained with the aid of Corollary 5.21 of the strict separation theorem. 




Theorem 5.52. Let $X$ be a Banach space. A linear functional on $X^*$ is weak*-continuous iff its restriction to $S^*$ is continuous with respect to the relative weak*-topology on $S^*$.

Theorem 5.53. Let $X$ be a Banach space. A convex set $C \subset X^*$ is weak*-closed iff it is bweak*-closed, that is, iff $C \cap aS^*$ is weak*-closed for all $a > 0$.


Exercises 


1. A Banach space $X$ is separable if it contains a countable dense subset. Prove that if $X^*$ is separable, then $X$ is separable (but the converse is false). (Hint: let $\{x_n^*\}$ be a sequence of unit vectors dense in the unit sphere of $X^*$. Pick unit vectors $x_n$ in $X$ such that $|x_n^* x_n| > 1/2$. Use Corollary 5.5 to show that $\mathrm{span}\{x_n\}$ is dense in $X$; the same is true when the scalars are complex numbers with rational real and imaginary parts.)


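2. Consider the normed space

$$C_0^n(\mathbb{R}) := \{f \in C_0(\mathbb{R});\ f^{(k)} \in C_0(\mathbb{R}),\ k = 1, \ldots, n\},$$

with the norm

$$\|f\| = \sum_{k=0}^{n} \|f^{(k)}\|_u.$$

Given $\phi \in C_0^n(\mathbb{R})^*$, prove that there exist complex Borel measures $\mu_k$ ($k = 0, \ldots, n$) such that

$$\phi(f) = \sum_{k=0}^{n} \int_{\mathbb{R}} f^{(k)}\, d\mu_k$$

for all $f \in C_0^n(\mathbb{R})$. (Hint: consider the subspace

$$Z = \{[f, f', \ldots, f^{(n)}];\ f \in C_0^n(\mathbb{R})\}$$

of $C_0 \times \cdots \times C_0$ ($n+1$ times). Define $\psi$ on Z by $\psi([f, f', \ldots, f^{(n)}]) = \phi(f)$, cf. Exercise 3, Chapter 4.)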


3. Let X be a Banach space, and let $T \subset X^*$ be (norm) bounded and weak*-closed. Prove:

(a) T is weak*-compact.

(b) If T is also convex, then it is the weak*-closed convex hull of its extremal points.


4. Let X, Y be normed spaces and $T \in B(X,Y)$. Prove that T is continuous with respect to the weak topologies on X and Y, and $T^* : Y^* \to X^*$ is continuous with respect to the weak*-topologies on $Y^*$ and $X^*$. (Recall that the Banach adjoint $T^*$ of $T \in B(X,Y)$ is defined by means of the identity $(T^*y^*)x = y^*(Tx)$, $x \in X$, $y^* \in Y^*$.)




5. Let $p, q \in [1, \infty]$ be conjugate exponents, and let $(X, \mathcal{A}, \mu)$ be a positive measure space. Let g be a complex measurable function on X such that $\|g\|_q \le M$ for some constant M. Then $\|fg\|_1 \le M\|f\|_p$ for all $f \in L^p(\mu)$ (by Theorems 1.26 and 1.33). Prove the converse!


Uniform convexity 


6. Let X be a normed space, and let B and S denote its closed unit ball and its unit sphere, respectively. We say that X is uniformly convex (u.c.) if for each $\epsilon > 0$ there exists $\delta = \delta(\epsilon) > 0$ such that $\|x - y\| < \epsilon$ whenever $x, y \in B$ are such that $\|(x + y)/2\| > 1 - \delta$. Prove:

(a) X is u.c. iff whenever $x_n, y_n \in S$ are such that $\|x_n + y_n\| \to 2$, it follows that $\|x_n - y_n\| \to 0$.

(b) Every inner product space is u.c.

(c) Let X be a u.c. normed space and $\{x_n\} \subset X$. Then $x_n \to x \in X$ strongly iff $x_n \to x$ weakly and $\|x_n\| \to \|x\|$. (Hint: suppose $x_n \to x$ weakly and $\|x_n\| \to \|x\|$. We may assume that $x_n, x \in S$. Pick $x_0^* \in X^*$ such that $x_0^* x = 1 = \|x_0^*\|$, cf. Corollary 5.7.)

(d) The “distance theorem” (Theorem 1.35) is valid for a u.c. Banach space X.

The following parts are steps in the proof of the result stated in Part (i) below.

(e) Let X be a u.c. Banach space, and let $\epsilon, \delta$ be as in the definition above. Denote by $S^*$ and $S^{**}$ the unit spheres of $X^*$ and $X^{**}$, respectively. Given $x_0^{**} \in S^{**}$, there exists $x_0^* \in S^*$ such that $|x_0^{**} x_0^* - 1| < \delta$. Also there exists $x \in B$ such that $|x_0^* x - 1| < \delta$. Define

$$E_\delta = \{x \in B;\ |x_0^* x - 1| < \delta\} \quad (\ne \emptyset!).$$

Show that $\|x - y\| < \epsilon$ for all $x, y \in E_\delta$.

(f) In any normed space X, the set

$$U := \{x^{**} \in X^{**};\ |x^{**} x_0^* - 1| < \delta\}$$

is a weak*-neighborhood of $x_0^{**}$.

(g) For any weak*-neighborhood V of $x_0^{**}$, the weak*-neighborhood $W := V \cap U$ of $x_0^{**}$ meets $\kappa B$. ($\kappa$ denotes the canonical embedding of X in $X^{**}$.) Thus, $V \cap \kappa(E_\delta) \ne \emptyset$, and therefore $x_0^{**}$ belongs to the weak*-closure of $\kappa(E_\delta)$ (cf. Goldstine’s theorem).

(h) Fix $x \in E_\delta$. Then $x_0^{**} \in \kappa x + \epsilon B^{**}$, where $B^{**}$ denotes the (norm) closed unit ball of $X^{**}$. (Hint: apply Parts (e) and (g), and the fact that $B^{**}$ is weak*-compact, hence weak*-closed.)

(i) Conclude from Part (h) that $d(x_0^{**}, \kappa B) = 0$, and therefore $x_0^{**} \in \kappa B$ since $\kappa B$ is norm-closed in $X^{**}$ (cf. paragraph preceding Theorem 5.9).

This proves the following theorem: uniformly convex Banach spaces are reflexive.


Miscellaneous 


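7. Let $\{\beta_n\}_{n=0}^{\infty} \in l^\infty$ be such that there exists a positive constant K for which

$$\Big|\sum_{n=0}^{N} \alpha_n \beta_n\Big| \le K \Big\|\sum_{n=0}^{N} \alpha_n t^n\Big\|_{C([0,1])}$$

for all $\alpha_0, \ldots, \alpha_N \in \mathbb{C}$ and $N = 0, 1, 2, \ldots$. Prove that there exists a unique regular complex Borel measure $\mu$ on $[0,1]$ such that $\beta_n = \int_0^1 t^n\, d\mu$ for all $n = 0, 1, 2, \ldots$. Moreover $\|\mu\| \le K$. Formulate and prove the converse.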


8. Prove the converse of Theorem 4.4. 


9. Prove the following generalization of the Stone-Weierstrass theorem. Let X be a locally compact Hausdorff space, and let A be a separating selfadjoint subalgebra of $C_0(X)$ that vanishes identically nowhere, that is, for each $x \in X$ there is $f \in A$ with $f(x) \ne 0$. Then A is dense in $C_0(X)$. Hint: if X is compact, prove that there are $0 < m < M$ and $f \in A$ with $m \le f \le M$, and use this to show that the constant function 1 belongs to the closure of A; now the usual Stone-Weierstrass theorem for compact Hausdorff spaces applies. Next, assume that X is not compact and let $Y = X \cup \{\infty\}$ be the Alexandroff one-point compactification of X. Observe that $C_0(X)$ embeds in $C(Y)$ as the subspace of functions vanishing at the point $\infty$. Letting B be the image of A under this embedding, prove that the algebra $B \oplus \mathbb{C}1 \subset C(Y)$ is dense in $C(Y)$ and deduce that A is dense in $C_0(X)$.


Milman’s theorem 


10. Let X be a t.v.s. and $A, B \subset X$ be compact and convex. Prove that $\mathrm{co}(A \cup B)$ (no closure required!) is compact.


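11. Prove Milman’s theorem, which complements (and does not rely on) the Krein-Milman theorem. Let X be a locally convex t.v.s., and let $K \subset X$ be closed such that $\overline{\mathrm{co}}(K)$ is compact. Then all extremal points of $\overline{\mathrm{co}}(K)$ belong to K. Hint: assume by contradiction that $x_0$ is an extremal point of $\overline{\mathrm{co}}(K)$ that does not belong to K. Prove that there exists a convex 0-neighborhood U such that $x_0 \notin K + \overline{U}$. Explain why there exist $x_1, \ldots, x_n \in K$ so that $K \subset \bigcup_{i=1}^{n} (x_i + U)$. For each $1 \le i \le n$, denote $K_i := \overline{\mathrm{co}}((x_i + U) \cap K)$. Then $K_i$ is compact and contained in $x_i + \overline{U}$. Deduce that

$$\overline{\mathrm{co}}(K) = \overline{\mathrm{co}}\Big(\bigcup_{i=1}^{n} K_i\Big) = \mathrm{co}\Big(\bigcup_{i=1}^{n} K_i\Big).$$

Now, $x_0$ being an extremal point of $\overline{\mathrm{co}}(K)$ implies that

$$x_0 \in \bigcup_{i=1}^{n} K_i \subset \bigcup_{i=1}^{n} (x_i + \overline{U}) \subset K + \overline{U},$$

which is a contradiction.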


6 


Bounded operators 


This chapter is devoted to the basics of bounded operator theory. We recall that 
B(X,Y) denotes the normed space of all bounded linear mappings from the 
normed space X to the normed space Y. The norm on B(X,Y) is the operator 
norm 


$$\|T\| = \sup_{\|x\| \le 1} \|Tx\| = \sup_{\|x\| < 1} \|Tx\| = \sup_{\|x\| = 1} \|Tx\| = \sup_{0 \ne x \in X} \frac{\|Tx\|}{\|x\|}.$$
The elements of B(X,Y) will be referred to as operators. 

Two theorems about B(X,Y), the Uniform Boundedness Theorem and the 
Open Mapping Theorem, are two of the “Three Basic Principles of Linear 
Functional Analysis” (the other one is the Hahn—Banach lemma, see Chapter 5). 
They are proved along with several of their consequences. The two theorems use 
so-called category arguments in their proofs, which are based on Baire’s theorem 
about complete metric spaces, with which we open this chapter. 

We mention here in particular one consequence of the open mapping 
theorem—the Closed Graph Theorem—which is an effectual tool for proving 
boundedness of linear maps between Banach spaces. 

The quotient X/M of a normed space X by a closed subspace M is defined, and proved to be a Banach space if X is. The converse, saying that if both M and X/M are Banach spaces then so is X, is given in Exercise 22.

The chapter ends with a section on two topologies that are weaker than the 
norm topology on B(X,Y), namely, the strong operator topology and the weak 
operator topology. These are the topologies of pointwise convergence in norm, 
respectively weakly. We prove that a linear functional on B(X,Y) is continuous 
in one of these topologies iff it is continuous in the other. These topologies play an 
important role in operator theory: for instance, see the exercises on Semigroups of 
Operators in Chapters 9 and 10 as well as Chapter 12 on von Neumann algebras. 


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press. 
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/os0/9780192849540.003.0006 




6.1 Category 


Theorem 6.1 (Baire’s theorem). Let X be a complete metric space (with metric d), and let $\{V_i\}$ be a sequence of open dense subsets of X. Then $V := \bigcap_{i=1}^{\infty} V_i$ is dense in X.

Proof. Let U be a nonempty open subset of X. We must show that $U \cap V \ne \emptyset$.

Since $V_1$ is dense, it follows that the open set $U \cap V_1$ is nonempty, and we may then choose a closed ball $\overline{B}(x_1, r_1) := \{x \in X;\ d(x, x_1) \le r_1\} \subset U \cap V_1$ with radius $r_1 < 1$. Let $B(x_1, r_1) := \{x \in X;\ d(x, x_1) < r_1\}$ be the corresponding open ball. Since $V_2$ is dense, it follows that the open set $B(x_1, r_1) \cap V_2$ is nonempty, and we may then choose a closed ball $\overline{B}(x_2, r_2) \subset B(x_1, r_1) \cap V_2$ with radius $r_2 < 1/2$. Continuing inductively, we obtain a sequence of balls $B(x_n, r_n)$ with $r_n < 1/n$, such that

$$\overline{B}(x_n, r_n) \subset B(x_{n-1}, r_{n-1}) \cap V_n \quad \text{for } n = 2, 3, \ldots.$$

If $i, j > n$, we have $x_i, x_j \in B(x_n, r_n)$, and therefore $d(x_i, x_j) < 2r_n < 2/n$. This means that $\{x_i\}$ is a Cauchy sequence. Since X is complete, the sequence converges to some $x \in X$. For $i > n$, $x_i \in \overline{B}(x_n, r_n)$ (a closed set!), and therefore $x \in \overline{B}(x_n, r_n) \subset U \cap V_n$ for all n, that is, $x \in U \cap V$.
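The role of completeness in Baire’s theorem can be illustrated by a standard counterexample, sketched here under the assumption that $\mathbb{Q}$ carries the usual metric $d(x,y) = |x - y|$ and that $\{q_i\}$ is some fixed enumeration of $\mathbb{Q}$ (both are illustrative choices, not notation from the text).

```latex
% Illustration (not from the text): Baire's theorem fails without completeness.
% Assumptions: X = \mathbb{Q} with d(x,y) = |x-y|; \{q_1, q_2, \ldots\} enumerates \mathbb{Q}.
\[
  V_i := \mathbb{Q} \setminus \{q_i\} \qquad (i = 1, 2, \ldots).
\]
% Each V_i is open, since the singleton \{q_i\} is closed in a metric space.
% Each V_i is dense, since every nonempty open subset of \mathbb{Q} is infinite
% and therefore contains a point different from q_i.
\[
  \bigcap_{i=1}^{\infty} V_i
  = \mathbb{Q} \setminus \bigcup_{i=1}^{\infty} \{q_i\}
  = \emptyset ,
\]
% which is not dense in \mathbb{Q}. Hence the hypothesis that X is complete
% cannot be dropped.
```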


Definition 6.2. A subset E of a metric space X is nowhere dense if its closure $\overline{E}$ has empty interior. A countable union of nowhere dense sets in X is called a set of (Baire’s) first category in X. A subset of X which is not of first category in X is said to be of (Baire’s) second category in X.


The family of subsets of first category is closed under countable unions. 
Subsets of sets of first category are also of first category. 

Using category terminology, Baire’s theorem has the following variant form, 
which is the basis for the “category arguments” mentioned earlier. 


Theorem 6.3 (Baire’s category theorem). A complete metric space is of 
Baire’s second category in itself. 


Proof. Suppose the complete metric space X is of Baire’s first category in itself. Then

$$X = \bigcup_i E_i$$

with $E_i$ nowhere dense ($i = 1, 2, \ldots$). Hence $X = \bigcup_i \overline{E_i}$. Taking complements, we see that

$$\bigcap_i (\overline{E_i})^{c} = \emptyset.$$

Since $E_i$ are nowhere dense, the sets in the above intersection are open dense sets. By Baire’s theorem, the intersection is dense, which is a contradiction.
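A typical “category argument” based on the theorem can be sketched as follows. The statement below (no countably infinite Hamel basis in a Banach space) is a standard application and is not taken from the text; it assumes the well-known fact that finite-dimensional subspaces of a normed space are closed, and the symbols $e_n$, $M_n$ are illustrative names.

```latex
% Illustration (standard application, not from the text).
% Claim: an infinite-dimensional Banach space X has no countable Hamel basis.
% Suppose \{e_1, e_2, \ldots\} were such a basis, and set
\[
  M_n := \operatorname{span}\{e_1, \ldots, e_n\}, \qquad
  X = \bigcup_{n=1}^{\infty} M_n .
\]
% (1) M_n is closed (assumed fact: finite-dimensional subspaces are closed).
% (2) M_n has empty interior: if B(a, r) \subset M_n, then
\[
  B(0, r) = B(a, r) - a \subset M_n
  \quad\Longrightarrow\quad
  X = \bigcup_{k=1}^{\infty} k\,B(0, r) \subset M_n ,
\]
%     contradicting \dim X = \infty.
% Hence each M_n is nowhere dense, so X would be of first category in itself,
% contradicting Baire's category theorem for the complete space X.
```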




6.2 The uniform boundedness theorem 


Theorem 6.4 (The uniform boundedness theorem, general version). Let X, Y be normed spaces, and let $\mathcal{T} \subset B(X,Y)$. Suppose the subspace

$$Z := \{x \in X;\ \sup_{T \in \mathcal{T}} \|Tx\| < \infty\}$$

is of Baire’s second category in X. Then

$$\sup_{T \in \mathcal{T}} \|T\| < \infty.$$

Proof. Denote

$$r(x) := \sup_{T \in \mathcal{T}} \|Tx\| \quad (x \in Z).$$

If $S_Y := \overline{B}_Y(0,1)$ is the closed unit ball in Y, then for all $T \in \mathcal{T}$, $Tx \in r(x) S_Y \subset n S_Y$ if $x \in Z$ and n is a positive integer $\ge r(x)$. Thus, $T(x/n) \in S_Y$ for all $T \in \mathcal{T}$, that is,

$$x/n \in \bigcap_{T \in \mathcal{T}} T^{-1} S_Y := E$$

for $n \ge r(x)$. This shows that

$$Z \subset \bigcup_n nE.$$

Since Z is of Baire’s second category in X and nE are closed sets (by continuity of each $T \in \mathcal{T}$), there exists n such that nE has nonempty interior. However, multiplication by scalars is a homeomorphism; therefore, E has nonempty interior $E^\circ$. Thus, let $B_X(a, \delta) \subset E$. Then for all $T \in \mathcal{T}$,

$$\delta\, T B_X(0,1) = T B_X(0,\delta) = T[B_X(a,\delta) - a] = T B_X(a,\delta) - Ta \subset TE - Ta \subset S_Y - S_Y \subset 2 S_Y.$$

Hence, $T B_X(0,1) \subset (2/\delta) S_Y$ for all $T \in \mathcal{T}$. This means that

$$\sup_{T \in \mathcal{T}} \|T\| \le \frac{2}{\delta}.$$


If X is complete, it is of Baire’s second category in itself by Theorem 6.3. 
This implies the nontrivial part of the following. 


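Corollary 6.5 (The uniform boundedness theorem). Let X be a Banach space, Y a normed space, and $\mathcal{T} \subset B(X,Y)$. Then the following two statements are equivalent:

$$\sup_{T \in \mathcal{T}} \|Tx\| < \infty \quad \text{for all } x \in X; \qquad \text{(i)}$$

$$\sup_{T \in \mathcal{T}} \|T\| < \infty. \qquad \text{(ii)}$$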
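That completeness of X matters for the nontrivial implication can be seen on a small example. The following sketch is an illustration not taken from the text; it assumes the space $c_{00}$ of finitely supported complex sequences with the supremum norm, and the functionals $T_n$ are hypothetical names introduced only for this example.

```latex
% Illustration (not from the text): (i) does not imply (ii) when X is incomplete.
% Assumptions: X = c_{00} (finitely supported sequences), \|x\| = \sup_k |x_k|,
% Y = \mathbb{C}, and T_n x := n\,x_n for x = (x_1, x_2, \ldots).
\[
  |T_n x| = n\,|x_n| \le n\,\|x\|, \qquad T_n e_n = n
  \quad\Longrightarrow\quad \|T_n\| = n ,
\]
% where e_n is the n-th unit sequence. For a fixed x \in c_{00}, choose N with
% x_k = 0 for all k > N; then T_n x = 0 for n > N, so
\[
  \sup_n |T_n x| = \max_{1 \le n \le N} n\,|x_n| < \infty
  \qquad\text{while}\qquad
  \sup_n \|T_n\| = \infty .
\]
% Thus (i) holds and (ii) fails; this is consistent with Corollary 6.5
% because c_{00} is not complete in the supremum norm.
```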




Corollary 6.6. Let X be a Banach space, Y a normed space, and let $\{T_n\}_{n \in \mathbb{N}} \subset B(X,Y)$ be such that

$$\exists \lim_n T_n x := Tx \quad \text{for all } x \in X.$$

Then $T \in B(X,Y)$ and $\|T\| \le \liminf_n \|T_n\| \le \sup_n \|T_n\| < \infty$.

Proof. The linearity of T is trivial. For each $x \in X$, $\sup_n \|T_n x\| < \infty$ (since $\lim_n T_n x$ exists). By Corollary 6.5, it follows that $\sup_n \|T_n\| := M < \infty$. For all unit vectors $x \in X$ and all $n \in \mathbb{N}$, $\|T_n x\| \le \|T_n\|$; therefore

$$\|Tx\| = \lim_n \|T_n x\| \le \liminf_n \|T_n\| \le M,$$

that is, $\|T\| \le \liminf_n \|T_n\| \le \sup_n \|T_n\| < \infty$.


Corollary 6.7. Let X be a normed space, and $E \subset X$. Then the following two statements are equivalent:

$$\sup_{x \in E} |x^* x| < \infty \quad \text{for all } x^* \in X^*. \qquad (1)$$

$$\sup_{x \in E} \|x\| < \infty. \qquad (2)$$

Proof. Let $\mathcal{T} := \kappa E \subset (X^*)^* = B(X^*, \mathbb{C})$, where $\kappa$ denotes the canonical embedding of X into $X^{**}$. Then (1) is equivalent to

$$\sup_{\kappa x \in \mathcal{T}} |(\kappa x) x^*| < \infty \quad \text{for all } x^* \in X^*,$$

and since the conjugate space $X^*$ is complete (cf. Corollary 4.5), Corollary 6.5 shows that this is equivalent to

$$\sup_{\kappa x \in \mathcal{T}} \|\kappa x\| < \infty.$$

Since $\kappa$ is isometric, the last statement is equivalent to (2).


Combining Corollaries 6.7 and 6.5, we obtain Corollary 6.8. 


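Corollary 6.8. Let X be a Banach space, Y a normed space, and $\mathcal{T} \subset B(X,Y)$. Then the following two statements are equivalent:

$$\sup_{T \in \mathcal{T}} |y^* T x| < \infty \quad \text{for all } x \in X,\ y^* \in Y^*. \qquad (3)$$

$$\sup_{T \in \mathcal{T}} \|T\| < \infty. \qquad (4)$$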
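The way Corollaries 6.7 and 6.5 combine may be spelled out as follows. This is a sketch of the routine argument, not a quotation from the text; the auxiliary set $E_x$ is a name introduced here for illustration.

```latex
% Sketch: weak pointwise boundedness implies uniform boundedness.
% Assume \sup_{T \in \mathcal{T}} |y^* T x| < \infty for all x \in X, y^* \in Y^*.
% Step 1. Fix x \in X and put (illustrative notation)
\[
  E_x := \{Tx;\ T \in \mathcal{T}\} \subset Y .
\]
% The assumption says \sup_{y \in E_x} |y^* y| < \infty for every y^* \in Y^*.
% Corollary 6.7, applied to the normed space Y and the set E_x, gives
\[
  \sup_{T \in \mathcal{T}} \|Tx\| = \sup_{y \in E_x} \|y\| < \infty
  \qquad (x \in X).
\]
% Step 2. Since X is a Banach space, Corollary 6.5 now yields
\[
  \sup_{T \in \mathcal{T}} \|T\| < \infty .
\]
% Converse. If M := \sup_{T \in \mathcal{T}} \|T\| < \infty, then
\[
  |y^* T x| \le \|y^*\|\,\|T\|\,\|x\| \le M\,\|y^*\|\,\|x\| < \infty .
\]
```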




6.3 The open mapping theorem 


Lemma 1. Let X, Y be normed spaces and $T \in B(X,Y)$. Suppose the range TX of T is of Baire’s second category in Y, and let $\epsilon > 0$ be given. Then there exists $\delta > 0$ such that

$$B_Y(0, \delta) \subset \overline{T B_X(0, \epsilon)}. \qquad (1)$$

Moreover, one has necessarily $\delta \le \|T\| \epsilon$.

The bar sign stands for the closure operation in Y.

Proof. We may write

$$X = \bigcup_{n=1}^{\infty} B_X(0, n\epsilon/2),$$

and therefore

$$TX = \bigcup_{n=1}^{\infty} T B_X(0, n\epsilon/2).$$

Since TX is of Baire’s second category in Y, there exists n such that $\overline{T B_X(0, n\epsilon/2)}$ has nonempty interior. Therefore, $\overline{T B_X(0, \epsilon/2)}$ has nonempty interior (because T is homogeneous and multiplication by n is a homeomorphism of Y onto itself). Let then $B_Y(a, \delta)$ be a ball contained in it. If $y \in B_Y(0, \delta)$, then

$$a,\ a + y \in B_Y(a, \delta) \subset \overline{T B_X(0, \epsilon/2)},$$

and therefore there exist sequences

$$\{x_k'\},\ \{x_k''\} \subset B_X(0, \epsilon/2)$$

such that

$$T x_k' \to a + y; \qquad T x_k'' \to a.$$

Let $x_k = x_k' - x_k''$. Then $\|x_k\| \le \|x_k'\| + \|x_k''\| < \epsilon$ and

$$T x_k = T x_k' - T x_k'' \to (a + y) - a = y,$$

that is, $\{x_k\} \subset B_X(0, \epsilon)$ and $y \in \overline{T B_X(0, \epsilon)}$. This proves (1).

We show finally that the relation $\delta \le \|T\| \epsilon$ follows necessarily from (1). Fix $y \in Y$ with $\|y\| = 1$ and $0 < t < 1$. Since $t\delta y \in B_Y(0, \delta)$, it follows from (1) that, for each $k \in \mathbb{N}$, there exists $x_k \in B_X(0, \epsilon)$ such that $\|t\delta y - T x_k\| < 1/k$. Therefore,

$$t\delta = \|t\delta y\| \le \|t\delta y - T x_k\| + \|T x_k\| < 1/k + \|T\| \epsilon.$$

Letting $k \to \infty$ and then $t \to 1$, the conclusion follows.




Lemma 2. Let X be a Banach space, Y a normed space, and $T \in B(X,Y)$. Suppose TX is of Baire’s second category in Y, and let $\epsilon > 0$ be given. Then there exists $\delta > 0$ such that

$$B_Y(0, \delta) \subset T B_X(0, \epsilon). \qquad (2)$$

Comparing the lemmas, we observe that the payoff for the added completeness hypothesis is the stronger conclusion (2) (instead of (1)).

Proof. We apply Lemma 1 with $\epsilon_n = \epsilon/2^{n+1}$, $n = 0, 1, 2, \ldots$. We then obtain $\delta_n > 0$ such that

$$B_Y(0, \delta_n) \subset \overline{T B_X(0, \epsilon_n)} \quad (n = 0, 1, 2, \ldots). \qquad (3)$$

We shall show that (2) is satisfied with $\delta := \delta_0$.

Let $y \in B_Y(0, \delta)$. By (3) with $n = 0$, there exists $x_0 \in B_X(0, \epsilon_0)$ such that

$$\|y - T x_0\| < \delta_1,$$

that is, $y - T x_0 \in B_Y(0, \delta_1)$. By (3) with $n = 1$, there exists $x_1 \in B_X(0, \epsilon_1)$ such that

$$\|(y - T x_0) - T x_1\| < \delta_2.$$

Proceeding inductively, we obtain a sequence $\{x_n;\ n = 0, 1, 2, \ldots\}$ such that (for $n = 0, 1, 2, \ldots$)

$$x_n \in B_X(0, \epsilon_n) \qquad (4)$$

and

$$\|y - T(x_0 + \cdots + x_n)\| < \delta_{n+1}. \qquad (5)$$

Write

$$s_n = x_0 + \cdots + x_n.$$

Then for non-negative integers $n > m$,

$$\|s_n - s_m\| = \|x_{m+1} + \cdots + x_n\| \le \|x_{m+1}\| + \cdots + \|x_n\| < \frac{\epsilon}{2^{m+2}} + \cdots + \frac{\epsilon}{2^{n+1}} < \frac{\epsilon}{2^{m+1}}, \qquad (6)$$

so that $\{s_n\}$ is a Cauchy sequence in X.

Since X is complete, $s := \lim s_n$ exists in X. By continuity of T and of the norm, the left-hand side of (5) converges to $\|y - Ts\|$ as $n \to \infty$. The right-hand side of (5) converges to 0, since $\delta_{n+1} \le \|T\| \epsilon_{n+1}$ by Lemma 1. Therefore, $y = Ts$. However, by (6) with $m = 0$, $\|s_n - s_0\| < \epsilon/2$ for all n; hence, $\|s - x_0\| \le \epsilon/2$, and by (4) $\|s\| \le \|x_0\| + \|s - x_0\| < \epsilon$. This shows that $y \in T B_X(0, \epsilon)$.
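The geometric-series estimate behind the Cauchy property and the final bound on $\|s\|$ can be written out explicitly. The following is a worked computation under the choice $\epsilon_k = \epsilon/2^{k+1}$ made in the proof; it adds no new hypothesis.

```latex
% Worked arithmetic for the tail estimate, with \epsilon_k = \epsilon / 2^{k+1}.
\[
  \sum_{k=m+1}^{n} \epsilon_k
  = \epsilon \sum_{j=m+2}^{n+1} 2^{-j}
  = \epsilon \left( 2^{-(m+1)} - 2^{-(n+1)} \right)
  < \frac{\epsilon}{2^{m+1}} \qquad (n > m \ge 0).
\]
% For m = 0 the bound reads \|s_n - x_0\| < \epsilon/2 for all n, hence
% \|s - x_0\| \le \epsilon/2 in the limit. Combined with
% \|x_0\| < \epsilon_0 = \epsilon/2, this gives
\[
  \|s\| \le \|x_0\| + \|s - x_0\| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon .
\]
```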


Theorem 6.9 (The open mapping theorem). Let X be a Banach space, 
and T € B(X,Y) for some normed space Y. Suppose TX is of Baire’s second 
category in Y. Then T is an open mapping. 


Proof. Let V be a nonempty open subset of X, and let $y \in TV$. Let then $x \in V$ be such that $y = Tx$. Since V is open, there exists $\epsilon > 0$ such that $B_X(x, \epsilon) \subset V$. Let $\delta$ correspond to $\epsilon$ as in Lemma 2. Then

$$B_Y(y, \delta) = y + B_Y(0, \delta) \subset Tx + T B_X(0, \epsilon) = T[x + B_X(0, \epsilon)] = T B_X(x, \epsilon) \subset TV.$$


This shows that TV is an open set in Y. 


Corollary 6.10. Let X,Y be Banach spaces, and let T € B(X,Y) be onto. Then 
T is an open map. 


Proof. Since T is onto, its range TX = Y is a Banach space, and is therefore 
of Baire’s second category in Y by Theorem 6.3. The result then follows from 
Theorem 6.9. 


Corollary 6.11 (Continuity of the inverse). Let X, Y be Banach spaces, and let $T \in B(X,Y)$ be one-to-one and onto. Then $T^{-1} \in B(Y,X)$.


Proof. By Corollary 6.10, T is a (linear) bijective continuous open map, that 
is, a (linear) homeomorphism. This means in particular that the inverse map is 
continuous. 


Corollary 6.12. Suppose the vector space X is a Banach space under two norms $\|\cdot\|_k$, $k = 1, 2$. If there exists a constant $M > 0$ such that $\|x\|_2 \le M \|x\|_1$ for all $x \in X$, then there exists a constant $N > 0$ such that $\|x\|_1 \le N \|x\|_2$ for all $x \in X$.

Norms satisfying inequalities of the form

$$\frac{1}{N} \|x\|_1 \le \|x\|_2 \le M \|x\|_1 \quad (x \in X)$$

for suitable constants $M, N > 0$ are said to be equivalent. They induce the same metric topology on X.

Proof. Let T be the identity map from the Banach space $(X, \|\cdot\|_1)$ to the Banach space $(X, \|\cdot\|_2)$. Then T is bounded (by the hypothesis on the norms), and clearly one-to-one and onto. The result then follows from Corollary 6.11.
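A concrete pair of comparable but inequivalent norms shows how the corollary is used in the contrapositive. The example below is a standard one and is not taken from the text; it assumes the space $C[0,1]$ with the supremum norm (playing the role of $\|\cdot\|_1$ in the corollary) and the integral norm (playing the role of $\|\cdot\|_2$).

```latex
% Illustration (not from the text). On X = C[0,1] put
\[
  \|f\|_{\infty} := \sup_{0 \le t \le 1} |f(t)|, \qquad
  \|f\|_{L^1} := \int_0^1 |f(t)|\,dt ,
  \qquad\text{so that}\qquad \|f\|_{L^1} \le \|f\|_{\infty} \quad (M = 1).
\]
% The reverse inequality fails for every constant N: for f_n(t) := t^n,
\[
  \|f_n\|_{\infty} = 1, \qquad \|f_n\|_{L^1} = \frac{1}{n+1} \to 0 .
\]
% Since (C[0,1], \|\cdot\|_\infty) is a Banach space, Corollary 6.12 shows that
% (C[0,1], \|\cdot\|_{L^1}) cannot be complete: otherwise the two norms would
% be equivalent.
```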


6.4 Graphs 


For the next corollary, we consider the Cartesian product $X \times Y$ of two normed spaces, as a normed space with the usual operations and with the norm

$$\|[x, y]\| = \|x\| + \|y\| \quad ([x, y] \in X \times Y).$$

Clearly the sequence $\{[x_n, y_n]\}$ is Cauchy in $X \times Y$ iff both sequences $\{x_n\}$ and $\{y_n\}$ are Cauchy in X and Y, respectively, and it converges to $[x, y]$ in $X \times Y$ if and only if both $x_n \to x$ in X and $y_n \to y$ in Y. Therefore, $X \times Y$ is complete if and only if both X and Y are complete.

Let T be a linear map with domain $D(T) \subset X$ and range in Y. The domain is a subspace of X. The graph of T is the subspace of $X \times Y$ defined by

$$\Gamma(T) := \{[x, Tx];\ x \in D(T)\}.$$

If $\Gamma(T)$ is a closed subspace of $X \times Y$, we say that T is a closed operator. Clearly T is closed iff whenever $\{x_n\} \subset D(T)$ is such that $x_n \to x$ and $T x_n \to y$, then $x \in D(T)$ and $Tx = y$.


Corollary 6.13 (The closed graph theorem). Let X,Y be Banach spaces and 
let T be a closed operator with D(T) = X and range in Y. Then T € B(X,Y). 


Proof. Let $P_X$ and $P_Y$ be the projections of $X \times Y$ onto X and Y, respectively, restricted to the closed subspace $\Gamma(T)$ (which is a Banach space, as a closed subspace of the Banach space $X \times Y$). They are continuous, and $P_X$ is one-to-one and onto. By Corollary 6.11, $P_X^{-1}$ is continuous, and therefore the composition $P_Y \circ P_X^{-1}$ is continuous. However,

$$P_Y \circ P_X^{-1} x = P_Y[x, Tx] = Tx \quad (x \in X),$$

that is, T is continuous.
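The hypothesis $D(T) = X$ cannot be dropped, as the classical differentiation operator illustrates. The example below is standard and is not taken from the text; it assumes $X = Y = C[0,1]$ with the supremum norm and the domain $C^1[0,1]$ of continuously differentiable functions.

```latex
% Illustration (not from the text): a closed operator that is not bounded.
% Assumptions: X = Y = C[0,1] with \|f\| = \sup |f|, D(T) = C^1[0,1], Tf := f'.
% T is closed: if f_n \in C^1[0,1], f_n \to f and f_n' \to g uniformly, then
\[
  f_n(t) = f_n(0) + \int_0^t f_n'(s)\,ds
  \;\longrightarrow\;
  f(t) = f(0) + \int_0^t g(s)\,ds ,
\]
% so f \in C^1[0,1] = D(T) and Tf = f' = g.
% T is not bounded on its domain: for f_n(t) := t^n,
\[
  \|f_n\| = 1, \qquad \|T f_n\| = \sup_{0 \le t \le 1} n\,t^{\,n-1} = n .
\]
% There is no conflict with Corollary 6.13, because D(T) is a proper
% (dense, non-closed) subspace of X.
```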


Remark 6.14. The proof of Lemma 2 used the completeness of X to get the convergence of the sequence $\{s_n\}$, which is the sequence of partial sums of the series $\sum x_n$. The point of the argument is that, if X is complete, then the convergence of a series $\sum x_n$ in X follows from its absolute convergence (that is, the convergence of the series $\sum \|x_n\|$). This property actually characterizes completeness of normed spaces.


Theorem 6.15. Let X be a normed space. Then X is complete iff absolute 
convergence implies convergence (of series in X ). 


Proof. Suppose X is complete. If $\sum \|x_k\|$ converges and $s_n$ denote the partial sums of $\sum x_k$, then for $n > m$

$$\|s_n - s_m\| = \Big\|\sum_{k=m+1}^{n} x_k\Big\| \le \sum_{k=m+1}^{n} \|x_k\| \to 0$$

as $m \to \infty$, and therefore $\{s_n\}$ converges in X.

Conversely, suppose absolute convergence implies convergence of series in X. Let $\{x_n\}$ be a Cauchy sequence in X. There exists a subsequence $\{x_{n_k}\}$ such that

$$\|x_{n_{k+1}} - x_{n_k}\| < \frac{1}{2^k}$$

(cf. proof of Lemma 1.30). The series

$$x_{n_1} + \sum_{k=1}^{\infty} (x_{n_{k+1}} - x_{n_k})$$

converges absolutely, and therefore converges in X. Its $(p-1)$-th partial sum is $x_{n_p}$, and so the Cauchy sequence $\{x_n\}$ has the convergent subsequence $\{x_{n_p}\}$; it follows that $\{x_n\}$ itself converges.
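A small example of an absolutely convergent series that fails to converge makes the characterization concrete. It is an illustration not taken from the text; it assumes the incomplete space $c_{00}$ of finitely supported sequences with the supremum norm, and $e_k$ denotes the $k$-th unit sequence.

```latex
% Illustration (not from the text): absolute convergence without convergence.
% Assumptions: X = c_{00} with \|x\| = \sup_k |x_k|, and x_k := 2^{-k} e_k.
\[
  \sum_{k=1}^{\infty} \|x_k\| = \sum_{k=1}^{\infty} 2^{-k} = 1 < \infty ,
  \qquad
  s_n = \sum_{k=1}^{n} x_k = (2^{-1}, 2^{-2}, \ldots, 2^{-n}, 0, 0, \ldots).
\]
% If s_n \to s in c_{00}, then convergence in the supremum norm forces
% coordinatewise convergence, so the k-th coordinate of s equals 2^{-k} \ne 0
% for every k. Such an s has infinite support, so s \notin c_{00}.
% Hence \sum x_k does not converge in c_{00}, in agreement with Theorem 6.15
% and the fact that c_{00} is not complete.
```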


6.5 Quotient space 


If M is a closed subspace of the normed space X, the vector space X/M is a normed space for the quotient norm

$$\|[x]\| := \mathrm{dist}\{0, [x]\} = \inf_{y \in [x]} \|y\|,$$

where $[x] := x + M$. The properties of the norm are easily verified; the assumption that M is closed is needed for the implication $\|[x]\| = 0$ implies $[x] = [0]$. For later use, we prove the following.


Theorem 6.16. Let M be a closed subspace of the Banach space X. Then X /M 
is a Banach space. 


Proof. Let $\{[x_n]\}$ be a Cauchy sequence in X/M. It has a subsequence $\{[x_n']\}$ such that

$$\|[x_{n+1}'] - [x_n']\| < \frac{1}{2^{n+1}}$$

(cf. proof of Lemma 1.30). Pick $y_1 \in [x_1']$ arbitrarily. Since

$$\inf_{y \in [x_2']} \|y - y_1\| = \|[x_2'] - [x_1']\| < 1/4,$$

there exists $y_2 \in [x_2']$ such that $\|y_2 - y_1\| < 1/2$. Assuming we found $y_k \in [x_k']$, $k = 1, \ldots, n$, such that $\|y_{k+1} - y_k\| < 1/2^k$ for $k \le n - 1$, since

$$\inf_{y \in [x_{n+1}']} \|y - y_n\| = \|[x_{n+1}'] - [x_n']\| < \frac{1}{2^{n+1}},$$

there exists $y_{n+1} \in [x_{n+1}']$ such that

$$\|y_{n+1} - y_n\| < \frac{2}{2^{n+1}} = 1/2^n.$$

The series

$$y_1 + \sum_{n=1}^{\infty} (y_{n+1} - y_n)$$

converges absolutely, and therefore converges in X, since X is complete (cf. Theorem 6.15). Its partial sums $y_n$ converge therefore to some $y \in X$. Then

$$\|[x_n'] - [y]\| = \|[y_n] - [y]\| = \|[y_n - y]\| \le \|y_n - y\| \to 0.$$

Since the Cauchy sequence $\{[x_n]\}$ has the convergent subsequence $\{[x_n']\}$, it follows that it converges as well.




Corollary 6.17. The quotient map $\pi : x \to [x]$ of the normed space X onto the normed space X/M (for a given closed subspace M of X) maps the open unit ball of X onto that of X/M. Thus, $\pi$ is an open mapping, and its norm is exactly 1 unless $M = X$.

Proof. The map $\pi$ is a norm-decreasing linear map. Thus, denoting by $S_X$ and $S_{X/M}$ the open unit balls of X and X/M, respectively, we have $\pi(S_X) \subset S_{X/M}$. Conversely, if $\|[x]\| < 1$, i.e. $\inf_{y \in [x]} \|y\| < 1$, then there is $y \in [x]$ whose norm is less than 1; that is, $y \in S_X$ and $\pi(y) = [x]$. This proves that $S_{X/M} \subset \pi(S_X)$. The other assertions are easy consequences of the equality $\pi(S_X) = S_{X/M}$.
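The quotient norm and the quotient map are easy to compute in a two-dimensional example. The following sketch is an illustration not taken from the text; it assumes $X = \mathbb{C}^2$ with the norm $\|(a,b)\| = |a| + |b|$ and the closed subspace $M = \{(a, 0);\ a \in \mathbb{C}\}$.

```latex
% Illustration (not from the text).
% Assumptions: X = \mathbb{C}^2, \|(a,b)\| = |a| + |b|, M = \{(a,0);\ a \in \mathbb{C}\}.
\[
  [(a,b)] = (a,b) + M = \{(a', b);\ a' \in \mathbb{C}\}, \qquad
  \|[(a,b)]\| = \inf_{a' \in \mathbb{C}} \big(|a'| + |b|\big) = |b| .
\]
% The infimum is attained at a' = 0. The open unit ball of X/M is therefore
% \{[(a,b)];\ |b| < 1\}, and it is the image under \pi of the open unit ball of X:
\[
  |b| < 1 \;\Longrightarrow\; (0, b) \in S_X \ \text{and}\ \pi(0, b) = [(a, b)] .
\]
% Finally \|\pi\| = 1, since \|\pi(a,b)\| = |b| \le \|(a,b)\| with equality
% at (0, 1).
```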


6.6 Operator topologies 


The norm topology on B(X,Y) is also called the uniform operator topology. This terminology is motivated by the fact that a sequence $\{T_n\} \subset B(X,Y)$ converges in the norm topology of B(X,Y) iff it converges (strongly) pointwise, uniformly on every bounded subset of X (that is, the sequence $\{T_n x\}$ converges strongly in Y, uniformly in x on any bounded subset of X). Indeed, if $\|T_n - T\| \to 0$, then for any bounded set $Q \subset X$,

$$\sup_{x \in Q} \|T_n x - Tx\| \le \|T_n - T\| \sup_{x \in Q} \|x\| \to 0,$$

so that $T_n x \to Tx$ strongly in Y, uniformly on Q. Conversely, if $T_n x$ converge strongly in Y uniformly for x in bounded subsets of X, this is true in particular for x in the unit ball $S = S_X$. Hence as $n, m \to \infty$,

$$\|T_n - T_m\| = \sup_{x \in S} \|(T_n - T_m)x\| = \sup_{x \in S} \|T_n x - T_m x\| \to 0.$$

If Y is complete, B(X,Y) is complete by Theorem 4.4, and therefore $T_n$ converge in the norm topology of B(X,Y).

We consider two additional topologies on B(X,Y), weaker than the uniform operator topology (u.o.t.). A net $\{T_j;\ j \in J\}$ converges to T in the strong operator topology (s.o.t.) of B(X,Y) if $T_j x \to Tx$ strongly in Y, for each $x \in X$ (this is strong pointwise convergence of the functions $T_j$!). Since the uniformity requirement has been dropped, this convergence is clearly weaker than convergence in the u.o.t. If one requires that $T_j x$ converge weakly to $Tx$ (rather than strongly!), for each $x \in X$, one gets a still-weaker convergence concept, called convergence in the weak operator topology (w.o.t.).
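The three modes of convergence can be told apart with the shift operators. The example below is a standard illustration not taken from the text; it assumes the Hilbert space $l^2$, the identification of its dual with $l^2$ through the inner product, and the operators $L$ (left shift) and $R$ (right shift), which are names introduced here.

```latex
% Illustration (not from the text), on X = Y = l^2:
%   L(x_1, x_2, \ldots) := (x_2, x_3, \ldots), \qquad
%   R(x_1, x_2, \ldots) := (0, x_1, x_2, \ldots).
% (a) L^n \to 0 in the s.o.t. but not in the u.o.t.:
\[
  \|L^n x\|^2 = \sum_{k > n} |x_k|^2 \to 0 \quad (x \in l^2), \qquad
  \|L^n\| = 1 \ \ (\text{since } L^n e_{n+1} = e_1).
\]
% (b) R^n \to 0 in the w.o.t. but not in the s.o.t.: for x, y \in l^2,
\[
  |(R^n x, y)| = \Big| \sum_{k=1}^{\infty} x_k\, \overline{y_{k+n}} \Big|
  \le \|x\| \Big( \sum_{j > n} |y_j|^2 \Big)^{1/2} \to 0 ,
  \qquad
  \|R^n x\| = \|x\| .
\]
% Thus the u.o.t., the s.o.t., and the w.o.t. are, in general, three
% different topologies on B(X,Y).
```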

The s.o.t. and the w.o.t. may be defined by giving bases as follows. 


Definition 6.18. 


1. A base for the strong operator topology on B(X,Y) consists of all the sets of the form

$$N(T, F, \epsilon) := \{S \in B(X,Y);\ \|(S - T)x\| < \epsilon,\ x \in F\},$$

where $T \in B(X,Y)$, $F \subset X$ is finite, and $\epsilon > 0$.

2. A base for the weak operator topology on B(X,Y) consists of all sets of the form

$$N(T, F, \Lambda, \epsilon) := \{S \in B(X,Y);\ |y^*(S - T)x| < \epsilon,\ x \in F,\ y^* \in \Lambda\},$$

where $T \in B(X,Y)$, $F \subset X$ and $\Lambda \subset Y^*$ are finite sets, and $\epsilon > 0$.


The sets N are referred to as basic neighborhoods of T in the s.o.t. (w.o.t., 
respectively). It is clear that net convergence in these topologies is precisely as 
described. 

Since the bases in Definition 6.18 consist of convex sets, it is clear that B(X,Y) is a locally convex topological vector space (t.v.s.) for each of the above topologies. We denote by $B(X,Y)_{s.o.}$ and $B(X,Y)_{w.o.}$ the t.v.s. B(X,Y) with the s.o.t. and the w.o.t., respectively.


Theorem 6.19. Let X, Y be normed spaces. Then

$$B(X,Y)_{s.o.}^* = B(X,Y)_{w.o.}^*.$$

Moreover, the general form of an element g of this (common) dual is

$$g(T) = \sum_k y_k^* T x_k \quad (T \in B(X,Y)),$$

where the sum is finite, $x_k \in X$, and $y_k^* \in Y^*$.

Proof. Let $g \in B(X,Y)_{s.o.}^*$. Since $g(0) = 0$, strong-operator continuity of g at zero implies the existence of $\epsilon > 0$ and of a finite set $F = \{x_1, \ldots, x_n\}$, such that $|g(T)| < 1$ for all $T \in N(0, F, \epsilon)$. Thus, the inequalities

$$\|T x_k\| < \epsilon \quad (k = 1, \ldots, n) \qquad (1)$$

imply $|g(T)| < 1$.

Consider the normed space $Y^n$ with the norm $\|[y_1, \ldots, y_n]\| := \sum_k \|y_k\|$. One verifies easily that $(Y^n)^*$ is isomorphic to $(Y^*)^n$: given $\Gamma \in (Y^n)^*$, there exists a unique vector $[y_1^*, \ldots, y_n^*] \in (Y^*)^n$ such that

$$\Gamma([y_1, \ldots, y_n]) = \sum_k y_k^* y_k \qquad (2)$$

for all $[y_1, \ldots, y_n] \in Y^n$.

With $x_1, \ldots, x_n$ as in (1), define the linear map

$$\Phi : B(X,Y) \to Y^n$$

by

$$\Phi(T) = [T x_1, \ldots, T x_n] \quad (T \in B(X,Y)).$$

On the range of $\Phi$ (a subspace of $Y^n$!), define $\Gamma$ by

$$\Gamma(\Phi(T)) = g(T) \quad (T \in B(X,Y)).$$

If $T, S \in B(X,Y)$ are such that $\Phi(T) = \Phi(S)$, then $\Phi(m(T - S)) = 0$, so that $m(T - S)$ satisfies (1) for all $m \in \mathbb{N}$. Hence, $m|g(T - S)| = |g(m(T - S))| < 1$ for all m, and therefore $g(T) = g(S)$. This shows that $\Gamma$ is well defined. It is clearly a linear functional on range($\Phi$). If $\|\Phi(T)\| < 1$, then $\|(\epsilon T) x_k\| < \epsilon$ for all k, hence, $|g(\epsilon T)| < 1$, that is, $|\Gamma(\Phi(T))| (= |g(T)|) < 1/\epsilon$. This shows that $\Gamma$ is bounded, with norm $\le 1/\epsilon$. By the Hahn-Banach theorem, $\Gamma$ has an extension as an element $\tilde\Gamma \in (Y^n)^*$. As observed, it follows that there exist $y_1^*, \ldots, y_n^* \in Y^*$ such that $\tilde\Gamma([y_1, \ldots, y_n]) = \sum_k y_k^* y_k$. In particular,

$$g(T) = \Gamma(\Phi(T)) = \tilde\Gamma([T x_1, \ldots, T x_n]) = \sum_k y_k^* T x_k \qquad (3)$$

for all $T \in B(X,Y)$.

In particular, this representation shows that g is continuous with respect to the w.o.t. Since (linear) functionals continuous with respect to the w.o.t. are trivially continuous with respect to the s.o.t., the theorem follows.


Corollary 6.20. A convex subset of B(X,Y) has the same closure in the w.o.t. 
and in the s.o.t. 


Proof. Let $K \subset B(X,Y)$ be convex (nonempty, without loss of generality), and denote by $K_s$ and $K_w$ its closures with respect to the s.o.t. and the w.o.t., respectively. Since the w.o.t. is weaker than the s.o.t., we clearly have $K_s \subset K_w$. Suppose there exists $T \in K_w$ such that $T \notin K_s$. By Corollary 5.21, there exists $f \in B(X,Y)_{s.o.}^*$ such that

$$\Re f(T) < \inf_{S \in K_s} \Re f(S). \qquad (4)$$

By Theorem 6.19, $f \in B(X,Y)_{w.o.}^*$. Since $T \in K_w$ and $K \subset K_s$, it follows that $\inf_{S \in K_s} \Re f(S) \le \inf_{S \in K} \Re f(S) \le \Re f(T)$, which is a contradiction.


Exercises 


1. Let X be a Banach space, Y a normed space, and $T \in B(X,Y)$. Prove that if $TX \ne Y$, then TX is of Baire’s first category in Y.


2. Let X, Y be normed spaces, and $T \in B(X,Y)$. Prove that

$$\|T\| = \sup\{|y^* T x|;\ x \in X,\ y^* \in Y^*,\ \|x\| = \|y^*\| = 1\}.$$


3. Let $(S, \mathcal{A}, \mu)$ be a positive measure space, and let $p, q \in [1, \infty]$ be conjugate exponents. Let $T : L^p(\mu) \to L^p(\mu)$. Prove that

$$\|T\| = \sup\Big\{\Big|\int (Tf)\, g\, d\mu\Big|;\ f \in L^p(\mu),\ g \in L^q(\mu),\ \|f\|_p = \|g\|_q = 1\Big\}.$$

(In case $p = 1$ or $p = \infty$, assume that the measure space is $\sigma$-finite.)


4. Let X be a Banach space, $\{Y_\alpha;\ \alpha \in I\}$ a family of normed spaces, and $T_\alpha \in B(X, Y_\alpha)$ ($\alpha \in I$). Define

$$Z = \{x \in X;\ \sup_{\alpha \in I} \|T_\alpha x\| = \infty\}.$$

Prove that Z is either empty or a dense $G_\delta$ in X.


5. Let X be a Banach space, Y be a normed space, and $T : D(T) \subset X \to Y$ be a closed operator with range R(T) of the second category in Y. Prove:

(a) T is an open mapping and $R(T) = Y$. Hint: either use the fact that the graph of T is a Banach space or adapt the proof of Theorem 6.9.

(b) There exists a constant $c > 0$ such that, for each $y \in Y$, there exists $x \in D(T)$ such that $y = Tx$ and $\|x\| \le c\|y\|$.

(c) If T is one-to-one, then $T^{-1}$ is bounded.


6. Let m denote Lebesgue measure on the interval $[0, 1]$, and let $1 \le p < r \le \infty$. Prove that the identity map of $L^r(m)$ into $L^p(m)$ is norm decreasing with range of Baire’s first category in $L^p(m)$.


7. Let X be a Banach space, and let $T : D(T) \subset X \to X$ be a linear operator. Suppose there exists $\alpha \in \mathbb{C}$ such that $(\alpha I - T)^{-1} \in B(X)$. Let $p(\lambda) = \sum c_k \lambda^k$ be any polynomial (over $\mathbb{C}$) of degree $n \ge 1$. Prove that the operator $p(T) := \sum c_k T^k$ (with domain $D(T^n)$) is closed. (Hint: induction on n. Write $p(\lambda) = (\lambda - \alpha) q(\lambda) + r$, where the constant r may be assumed to be zero, without loss of generality, and q is a polynomial of degree $n - 1$.)


8. Let X,Y be Banach spaces. The operator T € B(X,Y) is compact if 
the set TBx is conditionally compact in Y (where Bx denotes here the 
closed unit ball of X). Let K(X,Y) be the set of all compact operators 
in B(X,Y). Prove: 

(a) K(X,Y) is a (norm-)closed subspace of B(X,Y). 
(b) If Z is a Banach space, then 


$$K(X,Y)B(Z,X) \subset K(Z,Y) \quad \text{and} \quad B(Y,Z)K(X,Y) \subset K(X,Z).$$


In particular, K(X) := K(X, X) is a closed two-sided ideal in B(X). 
(c) T € B(X,Y) is a finite rank operator if its range TX is finite 
dimensional. Prove that every finite rank operator is compact. 


(d) If Y is a Hilbert space, then the subspace of B(X,Y) consisting of 
finite rank operators is dense in K(X,Y). 




Adjoints 


9. Let X, Y be Banach spaces, and let $T : X \to Y$ be a linear operator with domain $D(T) \subset X$ and range R(T). If T is one-to-one, the inverse map $T^{-1}$ is a linear operator with domain R(T) and range D(T).

If D(T) is dense in X, the (Banach) adjoint $T^*$ of T is defined as follows:

$$D(T^*) = \{y^* \in Y^*;\ y^* \circ T \text{ is continuous on } D(T)\}.$$

Since D(T) is dense in X, it follows that for each $y^* \in D(T^*)$ there exists a unique extension $x^* \in X^*$ of $y^* \circ T$ (cf. Exercise 1, Chapter 4); we set $x^* = T^* y^*$. Thus, $T^*$ is uniquely defined on $D(T^*)$ by the relation

$$(T^* y^*)x = y^*(Tx) \quad (x \in D(T)).$$

Prove:

(a) $T^*$ is closed. If T is closed, $D(T^*)$ is weak*-dense in $Y^*$, and if Y is reflexive, $D(T^*)$ is strongly dense in $Y^*$.

(b) If $T \in B(X,Y)$, then $T^* \in B(Y^*, X^*)$, and $\|T^*\| = \|T\|$. If $S, T \in B(X,Y)$, then $(\alpha S + \beta T)^* = \alpha S^* + \beta T^*$ for all $\alpha, \beta \in \mathbb{C}$. If $T \in B(X,Y)$ and $S \in B(Y,Z)$, then $(ST)^* = T^* S^*$.

(c) If $T \in B(X,Y)$, then $T^{**} := (T^*)^* \in B(X^{**}, Y^{**})$, $T^{**}|_X = T$, and $\|T^{**}\| = \|T\|$. In particular, if X is reflexive, then $T^{**} = T$ (note that $\kappa X$ is identified with X).

(d) If $T \in B(X,Y)$, then $T^*$ is continuous with respect to the weak*-topologies on $Y^*$ and $X^*$ (cf. Exercise 4, Chapter 5). Conversely, if $S \in B(Y^*, X^*)$ is continuous with respect to the weak*-topologies on $Y^*$ and $X^*$, then $S = T^*$ for some $T \in B(X,Y)$. Hint: given $x \in X$, consider the functional $\phi_x(y^*) = (Sy^*)x$ on $Y^*$.

(e) $\overline{R(T)} = \bigcap\{\ker(y^*);\ y^* \in \ker(T^*)\}$. In particular, $T^*$ is one-to-one iff R(T) is dense in Y.

(f) Let $x^* \in X^*$ and $M > 0$ be given. Then there exists $y^* \in D(T^*)$ with $\|y^*\| \le M$ such that $x^* = T^* y^*$ iff

$$|x^* x| \le M \|Tx\| \quad (x \in D(T)).$$

In particular, $x^* \in R(T^*)$ iff

$$\sup_{x \in D(T),\ Tx \ne 0} \frac{|x^* x|}{\|Tx\|} < \infty.$$

(Hint: Hahn-Banach.)

(g) Let $T \in B(X,Y)$ and let $S^*$ be the (norm-)closed unit ball of $Y^*$. Then $T^* S^*$ is weak*-compact.

(h) Let $T \in B(X,Y)$ have closed range TX. Suppose $x^* \in X^*$ vanishes on $\ker(T)$. Show that the map $\phi : TX \to \mathbb{C}$ defined by $\phi(Tx) = x^* x$ is a well-defined continuous linear functional, and therefore there exists $y^* \in Y^*$ such that $\phi = y^*|_{TX}$. (Hint: apply Corollary 6.10 to $T \in B(X, TX)$ to conclude that there exists $r > 0$ such that $\{y \in TX;\ \|y\| < r\} \subset T B_X(0,1)$, and deduce that $\|\phi\| \le (1/r)\|x^*\|$.)

(i) With T as in Part (h), prove that

$$T^* Y^* = \{x^* \in X^*;\ \ker(T) \subset \ker(x^*)\}.$$

In particular, $T^*$ has (norm-)closed range in $X^*$.


10. Let X be a Banach space, and let T be a one-to-one linear operator with domain and range dense in X. Prove that $(T^*)^{-1} = (T^{-1})^*$, and $T^{-1}$ is bounded (on its domain) iff $(T^*)^{-1} \in B(X^*)$.


11. Let T : D(T) ⊂ X → X have dense domain in the Banach space X.
Prove: 


(a) If the range R(T*) of T* is weak*-dense in X*, then T is one-to-one. 
(b) T⁻¹ exists and is bounded (on its domain) iff R(T*) = X*.


12. Let X be a Banach space, and T ∈ B(X). We say that T is bounded below
if

inf_{0≠x∈X} ||Tx|| / ||x|| > 0.


Prove: 


(a) If T is bounded below, then it is one-to-one and has closed range. 
(b) T is non-singular (that is, invertible in B(X)) iff it is bounded below
and T* is one-to-one.


Hilbert adjoint 
13. Let X be a Hilbert space, and T : D(T) ⊂ X → X a linear operator with
dense domain. The Hilbert adjoint T* of T is defined in a way analogous
to that of Exercise 9, through the Riesz representation:

D(T*) := {y ∈ X; x ↦ (Tx, y) is continuous on D(T)}.

Since D(T) is dense, given y ∈ D(T*), there exists a unique vector in X,
which we denote by T*y, such that

(Tx, y) = (x, T*y)   (x ∈ D(T)).


Prove:

(a) If T ∈ B(X), then T* ∈ B(X), ||T*|| = ||T||, T** = T, and
(αT)* = ᾱT* for all α ∈ C (ᾱ denotes the complex conjugate of α).
Also I* = I.

(b) If S,T ∈ B(X), then (S + T)* = S* + T* and (ST)* = T*S*.

(c) T ∈ B(X) is called a normal operator if T*T = TT*. Prove that T
is normal iff

(T*x, T*y) = (Tx, Ty)   (x,y ∈ X).   (1)

(d) If T ∈ B(X) is normal, then ||T*x|| = ||Tx|| and ||T*Tx|| = ||T²x||
for all x ∈ X. Conclude that ||T*T|| = ||T²|| and ||T²|| = ||T||². (Hint:
apply (1).)

14. Let X be a Hilbert space, and T : D(T) ⊂ X → X be a linear operator.
T is symmetric if (Tx, y) = (x, Ty) for all x,y ∈ D(T). Prove that if T is
symmetric and everywhere defined, then T ∈ B(X) and T = T*. (Hint:
Corollary 6.13.)


Miscellaneous 


15. Let X be a Hilbert space, and B : X × X → C be a sesquilinear form
such that

|B(x,y)| ≤ M||x|| ||y||   and   B(x,x) ≥ m||x||²

for all x,y ∈ X, for some constants M < ∞ and m > 0. Prove that there
exists a unique nonsingular T ∈ B(X) such that B(x,y) = (x, Ty) for all
x,y ∈ X. Moreover,

||T|| ≤ M   and   ||T⁻¹|| ≤ 1/m.

(This is the Lax–Milgram theorem.) Hint: apply Theorem 1.37 to get T;
show that R(T) is closed and dense (cf. Theorem 1.36), and apply
Corollary 6.11.

16. Let X,Y be normed spaces, and T : X → Y be linear. Prove that T is
an open map iff TB_X(0,1) contains B_Y(0,r) for some r > 0. When this
is the case, T is onto.

17. Let X be a Banach space, Y a normed space, and T ∈ B(X,Y). Suppose
the closure of TB_X(0,1) contains some ball B_Y(0,r). Prove that T is
open. (Hint: adapt the proof of Lemma 2 in the proof of Theorem 6.9,
and use Exercise 16.)

18. Let X be a Banach space, and let P ∈ B(X) be such that P² = P. Such
an operator is called a projection. Verify:

(a) I − P is a projection (called the complementary projection).


(b) The ranges PX and (I − P)X are closed subspaces such that X =
PX ⊕ (I − P)X. Moreover PX = ker(I − P) = {x; Px = x} and
(I − P)X = ker P.

(c) Conversely, if Y, Z are closed subspaces of X such that X = Y ⊕ Z
("complementary subspaces"), and P : X → Y is defined by
P(y + z) = y for all y ∈ Y, z ∈ Z, then P is a projection with
PX = Y and ker P = Z. (Hint: Corollary 6.13.)

(d) If Y, Z are closed subspaces of X such that Y ∩ Z = {0}, then Y + Z
is closed iff there exists a positive constant c such that ||y|| ≤ c||y + z||
for all y ∈ Y and z ∈ Z.

19. Let X,Y be Banach spaces, and let {T_n}_{n∈N} ⊂ B(X,Y) be Cauchy in
the s.o.t. (that is, {T_n x} is Cauchy for each x ∈ X). Prove that {T_n} is
convergent in B(X,Y) in the s.o.t.

20. Let X,Y be Banach spaces and T ∈ B(X,Y). Prove that T is one-to-one
with closed range iff there exists a positive constant c such that
||Tx|| ≥ c||x|| for all x ∈ X. In that case, T⁻¹ ∈ B(TX, X).

21. Let X be a Banach space, and let C ∈ B(X) be a contraction, that is,
||C|| ≤ 1. Prove:

(a) e^{t(C−I)} (defined by means of the usual series) is a contraction for all
t ≥ 0.

(b) ||C^m x − x|| ≤ m||Cx − x|| for all m ∈ N and x ∈ X.

(c) Let Q_n := e^{n(C−I)} − C^n (n ∈ N). Then

||Q_n x|| ≤ e^{−n} Σ_{k=0}^∞ (n^k/k!) ||C^{|k−n|} x − x||   (2)

for all n ∈ N and x ∈ X. (Hint: note that C^n x = e^{−n} Σ_k (n^k/k!) C^n x;
break the ensuing series for Q_n x into series over k ≤ n and over
k > n.)

(d) ||Q_n x|| ≤ Σ_{k≥0} e^{−n} (n^k/k!) |k − n| ||Cx − x||.

(e) ||Q_n x|| ≤ √n ||(C − I)x|| for all n ∈ N and x ∈ X. Hint: consider the
Poisson probability measure μ (with "parameter" n) on P(N), defined
by μ({k}) = e^{−n} n^k/k!; apply Schwarz's inequality in L²(μ) and Part
(d) to get the inequality

||Q_n x|| ≤ ||k − n||_{L²(μ)} ||Cx − x|| = √n ||Cx − x||.   (3)

(f) Let F : [0,∞) → B(X) be contraction-valued. For t > 0 fixed, set
A_n := (n/t)[F(t/n) − I], n ∈ N. Suppose sup_n ||A_n x|| < ∞ for all x
in a dense subspace D of X. Then

lim_{n→∞} ||e^{tA_n} x − F(t/n)^n x|| = 0   (4)


for all t > 0 and x ∈ X. Hint: by Part (a), ||e^{tA_n}|| ≤ 1, and therefore
||e^{tA_n} − F(t/n)^n|| ≤ 2. By Part (e) with C = F(t/n), the limit in (4)
is 0 for all x ∈ D.

22. Let M be a closed subspace of a normed space X. Theorem 6.16 shows
that if X is a Banach space, then so is the quotient space X/M.
Conversely, show that if both M and the quotient space X/M are Banach
spaces, then so is X.

23. Let X be a normed space and M ⊂ X. The annihilator of M in X* is

M^⊥ := {x* ∈ X*; x*(M) = {0}}.

Prove that M^⊥ is a weak*-closed subspace of X* and M^⊥ = (Span M)^⊥.
Assuming that M is a closed subspace of X, prove:

(a) The map that takes x* + M^⊥ to x*|_M, x* ∈ X*, is an isometric
isomorphism from X*/M^⊥ onto M*.

(b) The map that takes x* ∈ M^⊥ to the functional on X/M given by
x + M ↦ x*x, x ∈ X, is an isometric isomorphism from M^⊥ onto
(X/M)*.

(c) M = (M^⊥)_⊥, where (·)_⊥ is defined in the next exercise.

(d) The map that takes x + M to κ(x)|_{M^⊥}, x ∈ X, is an isometric
isomorphism from X/M onto its image {κ(x)|_{M^⊥}; x ∈ X} inside
(M^⊥)*. Here κ is the canonical embedding of X into X**.

24. Let X be a normed space and N ⊂ X*. The annihilator of N in X is

N_⊥ := {x ∈ X; x*x = 0 for all x* ∈ N}.

Prove that N_⊥ is a closed subspace of X and that N_⊥ coincides with the
annihilator in X of the weak*-closure of Span N.
Assuming that N is a weak*-closed subspace of X*, prove:

(a) N = (N_⊥)^⊥.

(b) There is a canonical isometric isomorphism from N onto (X/N_⊥)*.

(c) There is a canonical isometric isomorphism from X/N_⊥ onto the
subspace {κ(x)|_N; x ∈ X} of N*.

(d) The assertion of Part (c) actually holds for every subspace N of X*,
even when it is not weak*-closed.

25. Let X,Y be normed spaces. Prove that the operator norm function from
B(X,Y) to [0,∞) is lower semi-continuous when B(X,Y) is equipped
with the w.o.t. (and thus also when equipped with the s.o.t.). For this,
recall Exercise 6 in Chapter 3.

26. (Compare Exercise 8.) Let X,Y be Banach spaces. An operator T ∈
B(X,Y) is called weakly compact if the set TB_X is conditionally weakly
compact in Y, that is, conditionally compact in the weak topology on Y.
Prove:

(a) An operator T ∈ B(X,Y) is weakly compact iff the image of T**
: X** → Y** is contained in Y (where we identify Y with κY).
Hint: use Exercise 9 and some results from Section 5.5.

(b) If either X or Y is reflexive, then all elements of B(X, Y) are weakly 
compact. 


(c) The set of all weakly compact operators in B(X,Y) is a closed 
subspace of B(X,Y). 


(d) If Z is another Banach space, T ∈ B(X,Y) and S ∈ B(Y, Z), and if
either of T or S is weakly compact, then so is ST.


7 


Banach algebras 


Until this point our analysis focused mostly on single objects, for example, 
single functionals or, more generally, single operators. This chapter and 
Chapters 11-13 focus on algebras, more precisely: Banach algebras. These 
are Banach spaces with an additional product operation that turns them into 
algebras such that the norm is submultiplicative. Banach algebras extend and 
unify examples like B(X) for a Banach space X, C_0(X) for a topological space
X, and L¹(R) with convolution as multiplication.

The additional algebraic structure makes it possible to discuss invertible 
elements. This leads to the spectrum σ(a) of an element a of a unital Banach
algebra A, which is the set of all scalars λ such that λe − a is not invertible in A
(where e denotes the unit of A). The notion of the spectrum is truly fundamental 
as demonstrated in every basic course in linear algebra. The first section of this 
chapter is chiefly concerned with the spectrum and related notions. 

The next section restricts attention to commutative Banach algebras. To 
each such algebra A we associate a locally compact Hausdorff space Φ called the
Gelfand space of A and a non-trivial homomorphism Γ : A → C_0(Φ) called the
Gelfand representation, having crucial importance in Banach algebra theory. The
Gelfand space Φ is the space of non-zero complex homomorphisms on A, and
Γ(a) maps φ ∈ Φ to φ(a). So this map views certain functions on A as "points"
on which elements of A act, thus reversing the natural order. It turns out that 
there is a lot to gain from this perspective, as demonstrated in the so-called 
commutative Gelfand—Naimark theorem (Theorem 7.16). The discovery of this 
was a major breakthrough. 

In the easiest case where A = C_0(X) for a locally compact Hausdorff space
X, we have Φ = X and Γ is the identity map.

We then move on to a preliminary introduction of C*-algebras spanning two 
sections. A C*-algebra is a Banach algebra A having the additional algebraic 
structure of an involution and satisfying the so-called C*-identity: ||x*x|| = ||x||²
for all x € A. It is incredible how much comes out of this seemingly naive further 
assumption! Most of the C*-algebra theory in this book appears in Chapters 11 


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press.
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0007




and 13, while here our main goals are the celebrated commutative Gelfand— 
Naimark theorem, saying that the Gelfand representation of a commutative C*- 
algebra is an isometric *-isomorphism, and the continuous operational calculus 
of normal elements and its features. We cover the non-commutative counterpart 
of the Gelfand—Naimark theorem in Chapter 11. 

The chapter ends with a short discussion on the Arens products. For a Banach 
algebra A there are two natural products on its second dual A** each of which 
makes A** a Banach algebra in which A sits as a Banach subalgebra. We study 
these products and characterize the situation when they coincide, in which case 
A is called Arens regular. 


7.1 Basics 


This section introduces the theory of abstract Banach algebras, of which B(X) 
and C_0(X) are two important examples.

An algebra over a field F is a vector space A over F with a binary operation
· : A × A → A making (A, +, ·) a not-necessarily-unital ring, satisfying

λ(a·b) = (λa)·b = a·(λb)

for all a,b ∈ A and λ ∈ F. (Note that algebras are associative by definition, but
are not necessarily commutative.) We will usually drop the product symbol and
write ab for a·b.

If X is a Banach space, the Banach space B(X) (cf. Notation 4.3) is also 
an algebra under the composition of operators as multiplication. The operator 
norm (cf. Definition 4.1) clearly satisfies the relation 


||ST|| ≤ ||S|| ||T||   (S,T ∈ B(X)).


If the dimension of X is at least 2, it is immediate that B(X) is not commutative. 

On the other hand, if X is a topological space, the Banach space C_0(X) of
all complex continuous functions on X that vanish at infinity with pointwise
operations and the supremum norm ||f|| = sup_X |f| is a commutative algebra,
and again ||fg|| ≤ ||f|| ||g|| for all f,g ∈ C_0(X). If X is compact, then the function
with the constant value 1 on X is the unit of the algebra.


Definition 7.1. A (complex) Banach algebra is an algebra A over C, which is a
Banach space (as a vector space over C) under a norm that is submultiplicative:
||xy|| ≤ ||x|| ||y|| for all x,y ∈ A. We say that A is unital if it has an algebraic
unit e.


If we omit the completeness requirement in Definition 7.1, A is called a 
normed algebra. 

Note that the submultiplicativity of the norm implies the boundedness
(i.e., the continuity) of the linear map of left multiplication by a, L_a : x ↦ ax
(for any given a ∈ A), and clearly ||L_a|| ≤ ||a||, with equality when A is unital
and ||e|| = 1 (since then ||L_a|| ≥ ||ae|| = ||a||).




The same is true for the right multiplication map R_a : x ↦ xa. Actually,
multiplication is continuous as a map from A² to A, since

||xy − x'y'|| ≤ ||x|| ||y − y'|| + ||x − x'|| ||y'|| → 0

as [x', y'] → [x, y] in A².

Remark also that in the definition of unital Banach algebras some texts add 
the assumption that ||e|| = 1. We do not assume this, but the algebra can be 
renormed as in Exercise 2 with an equivalent norm so that this condition is 
satisfied. 


Examples. 


1. B(X) is a unital Banach algebra for a (complex) Banach space X.
2. Let X be a topological space.

i. C_0(X) is a commutative Banach algebra. It is
unital if X is compact. In particular, c_0 := C_0(N) is a Banach algebra.
ii. C_b(X) is a unital commutative Banach algebra, where C_b(X) is the
Banach space of bounded continuous complex-valued functions on X
with pointwise operations and the supremum norm.
3. L^∞(X, A, μ) is a unital commutative Banach algebra for a measure space
(X, A, μ) (with pointwise multiplication).
4. For 1 ≤ p ≤ ∞, l^p := L^p(N, P(N), counting measure) is a commutative
Banach algebra with term-wise multiplication: if x = {x_n}, y = {y_n} ∈ l^p,
then x is bounded and ||x||_∞ ≤ ||x||_p, so that (for p < ∞)

||xy||_p = (Σ_n |x_n y_n|^p)^{1/p} ≤ ||x||_∞ (Σ_n |y_n|^p)^{1/p} = ||x||_∞ ||y||_p ≤ ||x||_p ||y||_p.

l^p is unital iff p = ∞.

5. Let D := {z ∈ C; |z| < 1}, let D̄ be its closure, and let
A(D) := {f ∈ C(D̄); f is holomorphic on D}.
Then, with pointwise operations and the C(D̄)-norm, A(D) is a unital
commutative Banach algebra, called the disc algebra.

6. L¹(R) is a Banach algebra with respect to convolution; see Exercise 14.
In fact, for an arbitrary locally compact topological group G,
L¹(G, Haar measure) is a Banach algebra with respect to convolution. It is
unital iff G is discrete, and commutative iff G is abelian.
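
The chain of inequalities in Example 4 can be tested numerically on finite sequences. The following is only a minimal sketch: the helper names `p_norm`, `termwise_product`, and `is_submultiplicative` are illustrative choices, not notation from the text, and the sample sequences are arbitrary.

```python
import math

def p_norm(x, p):
    """l^p norm of a finite sequence x; p = math.inf gives the supremum norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def termwise_product(x, y):
    """Term-wise product of two sequences of equal length."""
    return [s * t for s, t in zip(x, y)]

def is_submultiplicative(x, y, p, tol=1e-12):
    """Check ||xy||_p <= ||x||_inf ||y||_p <= ||x||_p ||y||_p up to rounding."""
    lhs = p_norm(termwise_product(x, y), p)
    mid = p_norm(x, math.inf) * p_norm(y, p)
    rhs = p_norm(x, p) * p_norm(y, p)
    return lhs <= mid + tol and mid <= rhs + tol

# Arbitrary sample data (an assumption made for illustration).
x = [1.0, -2.0, 0.5, 3.0]
y = [0.25, 4.0, -1.0, 2.0]
checks = [is_submultiplicative(x, y, p) for p in (1, 2, 3.5, math.inf)]
```

A finite check of this kind proves nothing, of course; it only illustrates which of the two inequalities carries the estimate for each p.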


Definition 7.2. Let A be a unital Banach algebra, and a € A. We say that a 
is regular (or non-singular) if it is invertible in A, that is, if there exists b € A 
such that ab = ba = e. If a is not regular, we say that it is singular. 


If a is regular, the element b in Definition 7.2 is uniquely determined (if also
b' satisfies the requirement, then b' = b'e = b'(ab) = (b'a)b = eb = b), and is
called the inverse of a, denoted a⁻¹. Thus aa⁻¹ = a⁻¹a = e. In particular, a ≠ 0.




We denote by G(A) the set of all regular elements of A. It is a group under
the multiplication of A, and the map x ↦ x⁻¹ is an anti-automorphism of G(A).
Topologically, we have:


Theorem 7.3. Let A be a unital Banach algebra, and let G(A) be the group
of regular elements of A. Then G(A) is open in A, and the map x ↦ x⁻¹ is a
homeomorphism of G(A) onto itself.


Proof. Let y ∈ G := G(A) and δ := 1/||y⁻¹|| (note that ||y⁻¹|| ≠ 0, since
y⁻¹ ∈ G). We show that the ball B(y,δ) is contained in G (so that G is indeed
open).

Let x ∈ B(y,δ), and set a := y⁻¹x. We have

||e − a|| = ||y⁻¹(y − x)|| ≤ ||y⁻¹|| ||y − x|| < ||y⁻¹|| δ = 1.   (1)

Therefore, the geometric series Σ_n ||e − a||^n converges. By submultiplicativity of
the norm, the series Σ_n ||(e − a)^n|| converges as well, and since A is complete,
it follows (cf. Theorem 6.15) that the series

Σ_{n=0}^∞ (e − a)^n

converges in A to some element z ∈ A (v⁰ = e by definition, for any v ∈ A).
By continuity of L_a and R_a (with a = y⁻¹x),

az = Σ_n a(e − a)^n = Σ_n [e − (e − a)](e − a)^n = Σ_n [(e − a)^n − (e − a)^{n+1}] = e,

and similarly za = e. Hence a ∈ G and a⁻¹ = z. Since x = ya, also x ∈ G, as
wanted.
Furthermore (for x ∈ B(y,δ)!),

x⁻¹ = a⁻¹y⁻¹ = zy⁻¹,   (2)

and therefore by (1)

||x⁻¹ − y⁻¹|| = ||(z − e)y⁻¹|| = ||Σ_{n=1}^∞ (e − a)^n y⁻¹||
  ≤ Σ_{n=1}^∞ ||e − a||^n ||y⁻¹|| = ||e − a|| ||y⁻¹|| / (1 − ||e − a||)
  ≤ ||y⁻¹||² ||x − y|| / (1 − ||y⁻¹|| ||x − y||) → 0

as x → y. This proves the continuity of the map x ↦ x⁻¹ at y ∈ G. Since this
map is its own inverse (on G), it is a homeomorphism.




Remark 7.4. If we take in the preceding proof y = e (so that δ = 1 and a = x),
we obtain in particular that B(e,1) ⊂ G and

x⁻¹ = Σ_{n=0}^∞ (e − x)^n   (x ∈ B(e,1)).   (3)

Since B(e,1) = e − B(0,1), this is equivalent to

(e − u)⁻¹ = Σ_{n=0}^∞ u^n   (u ∈ B(0,1)).   (4)

Relation (4) is the abstract version of the elementary geometric series summation
formula.
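
Relation (4) can be observed concretely in the Banach algebra of 2 × 2 matrices. The sketch below is an illustration under stated assumptions rather than part of the text: the norm is the maximum absolute row sum (one of many submultiplicative norms), the matrix u is an arbitrary choice with norm 1/2, and all function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[s + t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[s - t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def row_sum_norm(a):
    """Operator norm induced by the max norm: maximum absolute row sum."""
    return max(sum(abs(t) for t in row) for row in a)

def neumann_sum(u, terms):
    """Partial sum e + u + ... + u^(terms-1) of the series in relation (4)."""
    n = len(u)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for _ in range(terms):
        total = mat_add(total, power)
        power = mat_mul(power, u)
    return total

u = [[0.2, 0.3], [0.1, -0.4]]   # row_sum_norm(u) is 0.5, so u lies in B(0,1)
z = neumann_sum(u, 100)
e = identity(2)
# (e - u) z should be e up to an error of order ||u||^100 plus rounding.
residual = row_sum_norm(mat_sub(mat_mul(mat_sub(e, u), z), e))
```

Since (e − u)(e + u + ... + u^{N−1}) = e − u^N, the residual is controlled by ||u||^N, which mirrors the convergence argument in the proof of Theorem 7.3.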


For x ∈ A arbitrary and λ complex with modulus > ||x||, since u := x/λ ∈
B(0,1), we then have e − x/λ ∈ G, and

(e − x/λ)⁻¹ = Σ_{n=0}^∞ x^n/λ^n.

Therefore λe − x = λ(e − x/λ) ∈ G and

(λe − x)⁻¹ = Σ_{n=0}^∞ x^n/λ^{n+1}   (5)

(for all complex λ with modulus > ||x||).
Definition 7.5. The resolvent set of x ∈ A is the set

ρ(x) := {λ ∈ C; λe − x ∈ G} = f⁻¹(G),

where f : C → A is the continuous function f(λ) := λe − x.


The complement of ρ(x) in C is called the spectrum of x, denoted σ(x). Thus
λ ∈ σ(x) iff λe − x is singular. The spectral radius of x, denoted r(x), is defined by

r(x) = sup{|λ|; λ ∈ σ(x)}.

By Theorem 7.3, ρ(x) is open, as the inverse image of the open set G by the
continuous function f. Therefore, σ(x) is a closed subset of C.
The resolvent of x, denoted R(·; x), is the function from ρ(x) to G defined by

R(λ; x) = (λe − x)⁻¹   (λ ∈ ρ(x)).

The series expansion (5) of the resolvent, valid for |λ| > ||x||, is called the
Neumann expansion.
Note the trivial but useful identity

xR(λ; x) = λR(λ; x) − e   (λ ∈ ρ(x)).




If x,y ∈ A and 1 ∈ ρ(xy), then

(e − yx)[e + yR(1; xy)x] = (e − yx) + y(e − xy)R(1; xy)x = e,

and

[e + yR(1; xy)x](e − yx) = (e − yx) + yR(1; xy)(e − xy)x = e.

Therefore 1 ∈ ρ(yx) and

R(1; yx) = e + yR(1; xy)x.

Next, for any λ ≠ 0, write λe − xy = λ[e − (x/λ)y]. If λ ∈ ρ(xy), then 1 ∈ ρ((x/λ)y);
hence 1 ∈ ρ(y(x/λ)), and therefore λ ∈ ρ(yx). By symmetry, this proves that

σ(xy) ∪ {0} = σ(yx) ∪ {0}.

Hence
r(yx) = r(xy).
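
The formula R(1; yx) = e + yR(1; xy)x can be sanity-checked on 2 × 2 real matrices. This is a hedged numerical sketch only: the matrices x and y are arbitrary choices for which 1 ∈ ρ(xy), and the helper names are not from the text.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(a, b):
    return [[s + t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[s - t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]

def inv2(a):
    """Inverse of a 2x2 matrix via the adjugate formula (requires det != 0)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

E = [[1.0, 0.0], [0.0, 1.0]]

def resolvent_at_one(a):
    """R(1; a) = (e - a)^(-1)."""
    return inv2(mat_sub(E, a))

x = [[1.0, 2.0], [0.0, 3.0]]
y = [[0.5, -1.0], [2.0, 0.0]]
xy, yx = mat_mul(x, y), mat_mul(y, x)

lhs = resolvent_at_one(yx)
rhs = mat_add(E, mat_mul(mat_mul(y, resolvent_at_one(xy)), x))
gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# For 2x2 matrices the spectrum is determined by trace and determinant.
trace = lambda a: a[0][0] + a[1][1]
det = lambda a: a[0][0] * a[1][1] - a[0][1] * a[1][0]
```

In finite dimensions xy and yx even have the same characteristic polynomial, so here the spectra agree including at 0; in general only the nonzero parts coincide, which is why {0} is added on both sides above.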
With f as shown restricted to ρ(x), R(λ; x) = f(λ)⁻¹; by Theorem 7.3, R(·; x)
is therefore continuous on the open set ρ(x).
By Remark 7.4,

{λ ∈ C; |λ| > ||x||} ⊂ ρ(x)   (6)

and

R(λ; x) = Σ_{n=0}^∞ x^n/λ^{n+1}   (|λ| > ||x||).   (7)

Thus

σ(x) ⊂ Δ(0, ||x||) := {λ ∈ C; |λ| ≤ ||x||}   (8)

and so

r(x) ≤ ||x||.   (9)

The spectrum of x is closed and bounded (since it is contained in Δ(0, ||x||)).
Thus σ(x) is a compact subset of the plane.


Theorem 7.6. Let A be a unital Banach algebra and x ∈ A. Then σ(x) is a
non-empty compact set, and R(·; x) is an analytic function on ρ(x) that vanishes
at ∞.


Proof. We observed already that σ(x) is compact. By continuity of the map
y ↦ y⁻¹ on G, (e − λ⁻¹x)⁻¹ → e⁻¹ = e as λ → ∞, and therefore

lim_{λ→∞} R(λ; x) = lim_{λ→∞} λ⁻¹(e − λ⁻¹x)⁻¹ = 0.

For λ ∈ ρ(x), since λe − x and x commute, also the inverse R(λ; x) commutes
with x. If also μ ∈ ρ(x), writing R(·) := R(·; x), we have

R(μ) = (λe − x)R(λ)R(μ) = λR(λ)R(μ) − xR(λ)R(μ)

and

R(λ) = R(λ)(μe − x)R(μ) = μR(λ)R(μ) − xR(λ)R(μ).

Subtracting, we obtain the so-called resolvent identity

R(μ) − R(λ) = (λ − μ)R(λ)R(μ).   (10)

For μ ≠ λ in ρ(x), rewrite (10) as

(1/(μ − λ)) (R(μ) − R(λ)) = −R(λ)R(μ).

Since R(·) is continuous on ρ(x), we have

lim_{μ→λ} (1/(μ − λ)) (R(μ) − R(λ)) = −R(λ)².

This shows that R(·) is analytic on ρ(x), and R'(·) = −R(·)².

For any x* ∈ A*, it follows that x*R(·) is a complex analytic function in ρ(x).
If σ(x) is empty, x*R(·) is entire and vanishes at ∞. By Liouville's theorem,
x*R(·) is identically 0, for all x* ∈ A*. Therefore, R(·) = 0, which is absurd
since R(·) has values in G(A). This shows that σ(x) ≠ ∅.
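
The resolvent identity (10) and the derivative formula R'(·) = −R(·)² lend themselves to a quick numerical check in the algebra of 2 × 2 complex matrices. The sketch below makes several assumptions for illustration only: the matrix x, the points λ and μ in its resolvent set, the step h of the difference quotient, and all helper names.

```python
def mat_mul(a, b):
    """Product of two 2x2 (complex) matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_sub(a, b):
    return [[s - t for s, t in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * t for t in row] for row in a]

def inv2(a):
    """Inverse of a 2x2 matrix via the adjugate formula (requires det != 0)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

E = [[1, 0], [0, 1]]

def resolvent(x, lam):
    """R(lam; x) = (lam e - x)^(-1), defined when lam is not an eigenvalue."""
    return inv2(mat_sub(scale(lam, E), x))

def max_gap(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

x = [[2, 1], [0, -1]]            # spectrum {2, -1}
lam, mu = 3 + 1j, -2 + 0.5j      # two points of the resolvent set
r_lam, r_mu = resolvent(x, lam), resolvent(x, mu)

# Identity (10): R(mu) - R(lam) = (lam - mu) R(lam) R(mu).
identity_gap = max_gap(mat_sub(r_mu, r_lam),
                       scale(lam - mu, mat_mul(r_lam, r_mu)))

# Difference quotient versus -R(lam)^2.
h = 1e-6
quotient = scale(1 / h, mat_sub(resolvent(x, lam + h), r_lam))
derivative_gap = max_gap(quotient, scale(-1, mat_mul(r_lam, r_lam)))
```

The first gap is at rounding level, while the second is of order h, as expected from a one-sided difference quotient.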


Corollary 7.7 (The Gelfand—Mazur theorem). A unital Banach algebra 
that is a division algebra equals Ce. 


Proof. Suppose the unital Banach algebra A is a division algebra. If x ∈
A, σ(x) ≠ ∅ (by Theorem 7.6); pick then λ ∈ σ(x). Since λe − x is singular,
and A is a division algebra, we must have λe − x = 0. Hence x = λe and
therefore A = Ce.


Let A be a unital Banach algebra. If p(λ) = Σ_k α_k λ^k is a polynomial with
complex coefficients, and x ∈ A, we denote as usual p(x) := Σ_k α_k x^k (where
x⁰ := e). The map p ↦ p(x) is an algebra homomorphism τ of the algebra of
polynomials (over C) into A, that sends 1 to e and λ to x.


Theorem 7.8 (The spectral mapping theorem for polynomials). Let A
be a unital Banach algebra. For any polynomial p (over C) and any x ∈ A,

σ(p(x)) = p(σ(x)).

Proof. Let μ = p(λ₀). Then λ₀ is a root of the polynomial μ − p, and therefore

μ − p(λ) = (λ − λ₀)q(λ)

for some polynomial q over C. Applying the homomorphism τ, we get

μe − p(x) = (x − λ₀e)q(x) = q(x)(x − λ₀e).

If μ ∈ ρ(p(x)), it follows that

(x − λ₀e)(q(x)R(μ; p(x))) = (R(μ; p(x))q(x))(x − λ₀e) = e,

so that λ₀ ∈ ρ(x). Therefore, if λ₀ ∈ σ(x), it follows that μ := p(λ₀) ∈ σ(p(x)).
This shows that

p(σ(x)) ⊂ σ(p(x)).

On the other hand, factor the polynomial μ − p into linear factors

μ − p(λ) = α Π_{k=1}^n (λ − λ_k).

Note that μ = p(λ_k) for all k = 1,...,n. Applying the homomorphism τ, we get

μe − p(x) = α Π_{k=1}^n (x − λ_k e).

If λ_k ∈ ρ(x) for all k, then the product here is in G(A), and therefore μ ∈ ρ(p(x)).
Consequently, if μ ∈ σ(p(x)), there exists k ∈ {1,...,n} such that λ_k ∈ σ(x),
and therefore μ = p(λ_k) ∈ p(σ(x)). This shows that σ(p(x)) ⊂ p(σ(x)).
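
For matrices, the spectrum is the set of eigenvalues, so the theorem can be observed directly. The following sketch is an illustration under assumptions: the 2 × 2 matrix x and the polynomial p(λ) = λ² − 3λ + 1 are arbitrary choices, the eigenvalues are obtained from the trace and determinant, and the function names are hypothetical.

```python
import cmath

def mat_mul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues_2x2(a):
    """Roots of the characteristic polynomial t^2 - tr(a) t + det(a)."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr - root) / 2, (tr + root) / 2]

def poly_at_scalar(coeffs, lam):
    """Horner evaluation; coeffs are listed from the highest degree down."""
    result = 0
    for c in coeffs:
        result = result * lam + c
    return result

def poly_at_matrix(coeffs, x):
    """Horner evaluation of p(x), with the constant term multiplying e."""
    result = [[0, 0], [0, 0]]
    for c in coeffs:
        result = mat_mul(result, x)
        result = [[result[i][j] + (c if i == j else 0) for j in range(2)]
                  for i in range(2)]
    return result

def same_spectrum(a, b, tol=1e-9):
    """Compare two finite lists of complex numbers as multisets, up to tol."""
    key = lambda z: (round(complex(z).real, 6), round(complex(z).imag, 6))
    pairs = zip(sorted(a, key=key), sorted(b, key=key))
    return len(a) == len(b) and all(abs(s - t) < tol for s, t in pairs)

coeffs = [1, -3, 1]          # p(lam) = lam^2 - 3 lam + 1
x = [[1, 2], [3, 0]]         # eigenvalues -2 and 3

spectrum_of_px = eigenvalues_2x2(poly_at_matrix(coeffs, x))
p_of_spectrum = [poly_at_scalar(coeffs, lam) for lam in eigenvalues_2x2(x)]
```

Here p(x) is the matrix with rows (5, −4) and (−6, 7), whose eigenvalues 1 and 11 are exactly p(3) and p(−2).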


Theorem 7.9 (The Beurling–Gelfand spectral radius formula). For any
element x of a unital Banach algebra A,

lim_{n→∞} ||x^n||^{1/n} = r(x) = inf_{n∈N} ||x^n||^{1/n}.

Proof. By Theorem 7.8 with the polynomial p(λ) = λ^n (n ∈ N),

σ(x^n) = σ(x)^n := {λ^n; λ ∈ σ(x)}.

Hence, by (8) applied to x^n, |λ^n| ≤ ||x^n|| for all n and λ ∈ σ(x). Thus, |λ| ≤
||x^n||^{1/n} for all n, and therefore

|λ| ≤ inf_{n∈N} ||x^n||^{1/n} ≤ lim inf_n ||x^n||^{1/n}

for all λ ∈ σ(x). Taking the supremum over all such λ, we obtain

r(x) ≤ inf_{n∈N} ||x^n||^{1/n} ≤ lim inf_n ||x^n||^{1/n}.   (11)

For each x* ∈ A*, the complex function x*R(·) is analytic in ρ(x), and since σ(x)
is contained in the closed disc around 0 with radius r(x), ρ(x) contains the open
"annulus" r(x) < |λ| < ∞. By Laurent's theorem, x*R(·) has a unique Laurent
series expansion in this annulus. In the possibly smaller annulus ||x|| < |λ| < ∞
(cf. (9)), this function has the expansion (cf. (7))

x*R(λ) = Σ_{n=0}^∞ x*(x^n)/λ^{n+1}.

This is a Laurent expansion; by uniqueness, this is the Laurent expansion of
x*R(·) in the full annulus r(x) < |λ| < ∞. The convergence of the series implies
in particular that

sup_n |x*(x^n/λ^{n+1})| < ∞   (|λ| > r(x))

for all x* ∈ A*. By Corollary 6.7, it follows that (whenever |λ| > r(x))

sup_n ||x^n/λ^{n+1}|| =: M_λ < ∞.

Hence, for all n ∈ N and |λ| > r(x),

||x^n|| ≤ M_λ |λ|^{n+1},

so that

lim sup_n ||x^n||^{1/n} ≤ |λ|,

and therefore

lim sup_n ||x^n||^{1/n} ≤ r(x).   (12)

The conclusion of the theorem follows from (11) and (12).
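
The gap between ||x|| and r(x), and the way ||x^n||^{1/n} closes it, can be watched numerically. The sketch below is an assumption-laden illustration: the upper triangular matrix x (with eigenvalues 1/2 and 1/4, so r(x) = 1/2), the choice of the maximum absolute row sum as norm, and the names `row_sum_norm` and `root_norms` are all choices made here, not taken from the text.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def row_sum_norm(a):
    """A submultiplicative norm on matrices: maximum absolute row sum."""
    return max(sum(abs(t) for t in row) for row in a)

def root_norms(x, n_max):
    """Return the list of ||x^n||^(1/n) for n = 1, ..., n_max."""
    values = []
    power = x
    for n in range(1, n_max + 1):
        values.append(row_sum_norm(power) ** (1.0 / n))
        power = mat_mul(power, x)
    return values

# Upper triangular, so the spectrum is the diagonal {0.5, 0.25}.
x = [[0.5, 4.0], [0.0, 0.25]]
spectral_radius = 0.5

values = root_norms(x, 200)
```

The first value is ||x|| = 4.5, far above r(x); by inequality (11) every value stays at or above r(x), and the later ones approach 1/2 slowly, roughly like 17^{1/n}/2 for this particular matrix.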


Definition 7.10. An element x of a unital Banach algebra is said to be quasi-
nilpotent if lim ||x^n||^{1/n} = 0.


By Theorem 7.9, the element x is quasi-nilpotent iff r(x) = 0, that is, iff
σ(x) = {0}.

In particular, nilpotent elements (x^n = 0 for some n) are quasi-nilpotent.

We consider now the boundary points of the open set G(A).


Theorem 7.11. Let A be a unital Banach algebra and x be a boundary point
of G(A). Then x is a (two-sided) topological divisor of zero, that is, there exist
sequences of unit vectors {x_n} and {x'_n} such that x_n x → 0 and x x'_n → 0.


Proof. Let x ∈ ∂G (:= the boundary of G := G(A)). Since G is open, there
exists a sequence {y_n} ⊂ G such that y_n → x and x ∉ G.

If {||y_n⁻¹||} is bounded (say by 0 < M < ∞), and n is so large that ||x − y_n|| <
1/M, then

||y_n⁻¹x − e|| = ||y_n⁻¹(x − y_n)|| ≤ ||y_n⁻¹|| ||x − y_n|| < 1,

and therefore z := y_n⁻¹x ∈ G by Remark 7.4. Hence, x = y_n z ∈ G, contradiction.
Thus, {||y_n⁻¹||} is unbounded, and has therefore a subsequence {||y_{n_k}⁻¹||} diverging
to infinity. Define

x_k := y_{n_k}⁻¹ / ||y_{n_k}⁻¹||   (k ∈ N).

Then ||x_k|| = 1 and

||x_k x|| = ||x_k y_{n_k} + x_k(x − y_{n_k})|| ≤ ||e|| / ||y_{n_k}⁻¹|| + ||x_k|| ||x − y_{n_k}|| → 0,

and similarly x x_k → 0.
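
The construction in this proof can be followed step by step in the algebra of 2 × 2 matrices. The sketch below is a hedged illustration: the singular matrix x = diag(1, 0), the approximating invertible matrices y_n = diag(1, 1/n), the row-sum norm, and the function names are assumptions chosen to keep the arithmetic transparent.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def row_sum_norm(a):
    """A submultiplicative norm on matrices: maximum absolute row sum."""
    return max(sum(abs(t) for t in row) for row in a)

# x = diag(1, 0) is singular and is the limit of y_n = diag(1, 1/n) in G.
x = [[1.0, 0.0], [0.0, 0.0]]

def normalized_inverse(n):
    """x_n := y_n^(-1) / ||y_n^(-1)|| for y_n = diag(1, 1/n)."""
    y_inv = [[1.0, 0.0], [0.0, float(n)]]
    size = row_sum_norm(y_inv)          # equals n for n >= 1
    return [[t / size for t in row] for row in y_inv]

indices = [1, 10, 100, 1000]
unit_norms = [row_sum_norm(normalized_inverse(n)) for n in indices]
left_products = [row_sum_norm(mat_mul(normalized_inverse(n), x)) for n in indices]
right_products = [row_sum_norm(mat_mul(x, normalized_inverse(n))) for n in indices]
```

In this example ||x_n x|| = 1/n. In finite dimensions a boundary point of G is in fact an honest divisor of zero (here x·diag(0, 1) = 0), so the sequence is only needed in infinite-dimensional algebras.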




Theorem 7.12. Let B be a Banach subalgebra of the unital Banach algebra A
such that e ∈ B. If x ∈ B, denote the spectrum of x as an element of B by
σ_B(x). Then

(1) σ(x) ⊂ σ_B(x) and
(2) ∂σ_B(x) ⊂ ∂σ(x).


Proof. The first inclusion is trivial, since G(B) ⊂ G(A), so that ρ_B(x) ⊂ ρ(x).
Let λ ∈ ∂σ_B(x). Then λe − x ∈ ∂G(B), and therefore, by Theorem 7.11, λe − x
is a topological divisor of zero in B, hence in A. In particular, λe − x ∉ G(A),
that is, λ ∈ σ(x). This shows that ∂σ_B(x) ⊂ σ(x). Since ρ_B(x) ⊂ ρ(x), we obtain
(using part (1)), with cl denoting closure in C:

∂σ_B(x) = cl(ρ_B(x)) ∩ σ_B(x) ⊂ [cl(ρ(x)) ∩ σ_B(x)] ∩ σ(x)
        = cl(ρ(x)) ∩ σ(x) = ∂σ(x).


Corollary 7.13. Let B be a Banach subalgebra of the unital Banach algebra A
such that e ∈ B, and x ∈ B. Then σ_B(x) = σ(x) if either σ_B(x) is nowhere
dense or ρ(x) is connected.


Proof. If σ_B(x) is nowhere dense, Theorem 7.12 implies that

σ_B(x) = ∂σ_B(x) ⊂ ∂σ(x) ⊂ σ(x) ⊂ σ_B(x),

and the conclusion follows.

If ρ(x) is connected and σ(x) is a proper subset of σ_B(x), there exists λ ∈
σ_B(x) ∩ ρ(x), and it can be connected with the point at ∞ by a continuous curve
lying in ρ(x). Since σ_B(x) is compact, the curve meets ∂σ_B(x) at some point, and
therefore ∂σ_B(x) ∩ ρ(x) ≠ ∅, contradicting Statement (2) of Theorem 7.12.


The definition of the spectrum naturally requires the algebra to be unital.
This can be "fixed" as follows. Let A be a non-unital Banach algebra. The
Cartesian product Banach space A^# := A × C with the norm ||[x, λ]|| = ||x|| + |λ|
and the multiplication

[x, λ][y, μ] = [xy + λy + μx, λμ]

is a unital Banach algebra with the identity e := [0,1] in which A is
(isometrically) naturally embedded. It is called the unitization of A (see
Exercise 1). For a ∈ A, define σ(a), ρ(a), R(·; a) to be σ_{A^#}(a), ρ_{A^#}(a), R_{A^#}(·; a),
respectively (the last one has values in A^#). Then it is immediate that Theorems
7.6, 7.8, and 7.9 hold for general Banach algebras.
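
The multiplication rule of the unitization is easy to model in code. The following is a minimal sketch under assumptions: the role of A is played by finite sequences with term-wise multiplication and the l¹ norm (a stand-in chosen for convenience, which happens to be unital in finite dimensions), and the class name `Unitized` is hypothetical.

```python
class Unitized:
    """Element [x, lam] of A x C, where x is a tuple standing in for an element of A."""

    def __init__(self, x, lam):
        self.x = tuple(x)
        self.lam = lam

    def __mul__(self, other):
        # [x, lam][y, mu] = [xy + lam*y + mu*x, lam*mu], with xy term-wise.
        xy = tuple(s * t for s, t in zip(self.x, other.x))
        x = tuple(p + self.lam * t + other.lam * s
                  for p, s, t in zip(xy, self.x, other.x))
        return Unitized(x, self.lam * other.lam)

    def __eq__(self, other):
        return self.x == other.x and self.lam == other.lam

    def norm(self):
        # ||[x, lam]|| = ||x|| + |lam|, with the l^1 norm on the A-component.
        return sum(abs(t) for t in self.x) + abs(self.lam)


unit = Unitized((0.0, 0.0, 0.0), 1.0)      # the identity e = [0, 1]
a = Unitized((1.0, -2.0, 0.5), 2.0)
b = Unitized((0.0, 3.0, 1.0), -1.0)
c = Unitized((2.0, 0.0, -1.0), 0.5)

product = a * b
```

With these sample values, a·b has A-component (−1, 2, 2) and scalar component −2, so ||ab|| = 7, well below ||a|| ||b|| = 27.5.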

Recall that an ideal in an algebra A is a linear subspace M C A that is 
A-invariant, that is, AM, MA ⊂ M. In this case, the quotient space A/M is an
algebra. Say that M is proper if it is not equal to A. If A is a Banach algebra and 
M is closed, the quotient norm on the Banach space A/M (cf. Theorem 6.16) 
is submultiplicative, that is, A/M is a Banach algebra. If, moreover, A is unital 
and M is proper, then A/M is also unital, with e + M as its algebraic unit. 
Section 7.2 discusses the case of commutative Banach algebras in more detail. 




7.2 Commutative Banach algebras 


Let A be a unital commutative Banach algebra. 

1. Let x ∈ A. Then x ∈ G(A) iff xA = A. Equivalently, x is singular iff
xA ≠ A, that is, iff x is contained in a proper ideal (xA). Proper ideals are
contained therefore in the closed set G(A)^c, and it follows that the closure of a
proper ideal is a proper ideal.

2. A maximal ideal M (in A) is a proper ideal in A with the property that
if N is a proper ideal in A containing M, then N = M. Since the closure of
M is a proper ideal containing M, it follows that maximal ideals are closed.
In particular A/M is a Banach algebra, and is also a field (by a well-known
elementary algebraic characterization of maximal ideals). By Corollary 7.7, A/M
is isomorphic (and isometric) to C. Composing the natural homomorphism A →
A/M with this isomorphism, we obtain a (norm-decreasing) homomorphism φ_M
of A onto C, whose kernel is M. Thus, for any x ∈ A, φ_M(x) is the unique scalar
λ such that x + M = λe + M. Equivalently, φ_M(x) is uniquely determined by
the relation φ_M(x)e − x ∈ M.

Let Φ = Φ(A) denote the set of all homomorphisms of A onto C. Note that
φ ∈ Φ iff φ is a homomorphism of A into C such that φ(e) = 1 (equivalently, iff
φ is a non-zero homomorphism of A into C). The elements of Φ are called the
characters of A.

The mapping M ↦ φ_M described here is a mapping of the set ℳ of all
maximal ideals into Φ.

On the other hand, if φ ∈ Φ, and M := ker φ, then (by Noether's "first
homomorphism theorem") A/M is isomorphic to C, and is therefore a field. By
the algebraic characterization of maximal ideals mentioned, it follows that M is
a maximal ideal. We have ker φ_M = M = ker φ. For any x ∈ A, x − φ(x)e ∈
ker φ = ker φ_M, hence 0 = φ_M(x − φ(x)e) = φ_M(x) − φ(x). This shows that
φ = φ_M, that is, the mapping M ↦ φ_M is onto. It is clearly one-to-one, because
if M,N ∈ ℳ are such that φ_M = φ_N, then M = ker φ_M = ker φ_N = N. We
conclude that the mapping M ↦ φ_M is a bijection of ℳ onto Φ, with the inverse
mapping φ ↦ ker φ.

3. If J is a proper ideal, then e ∉ J. The set 𝒰 of all proper ideals containing
J is partially ordered by inclusion, and every totally ordered subset 𝒰₀ has the
upper bound ⋃𝒰₀ (which is proper because the identity does not belong to it) in
𝒰. By Zorn's lemma, 𝒰 has a maximal element, which is clearly a maximal ideal
containing J. Thus, every proper ideal is contained in a maximal ideal. Together
with 1., this shows that an element x is singular iff it is contained in a maximal
ideal M. By the bijection established between ℳ and Φ, this means that x is
singular iff φ(x) = 0 for some φ ∈ Φ. Therefore, for any x ∈ A, λ ∈ σ(x) iff
φ(λe − x) = 0 for some φ ∈ Φ, that is, iff λ = φ(x) for some φ. Thus

σ(x) = {φ(x); φ ∈ Φ}.   (1)

Therefore

sup_{φ∈Φ} |φ(x)| = sup_{λ∈σ(x)} |λ| =: r(x).   (2)




By (9) in Section 7.1, it follows in particular that |φ(x)| ≤ ||x||, so that the
homomorphism φ is necessarily continuous, with norm ≤ 1. If we assume that
||e|| = 1, then since ||φ|| ≥ |φ(e)| = 1, we have ||φ|| = 1 (for all φ ∈ Φ).

We have then Φ ⊂ S*, where S* is the strongly closed unit ball of A*. By
Theorem 5.24, S* is compact in the weak* topology on A*. If φ_α is a net in Φ
converging weak* to h ∈ S*, then

h(xy) = lim_α φ_α(xy) = lim_α φ_α(x)φ_α(y) = h(x)h(y)

for all x,y ∈ A and h(e) = lim_α φ_α(e) = 1, so that h ∈ Φ. This shows that Φ
is a closed subset of the compact space S* (with the weak* topology), hence Φ
is compact (in this topology). Since the weak* topology on A* is Hausdorff, we
conclude that Φ endowed with the relative weak* topology (called the Gelfand
topology on Φ) is a compact Hausdorff space. It has several names: the Gelfand
space, the character space, the structure space, and the spectrum of A.

4. For any x ∈ A, let x̂ := (κx)|_Φ, the restriction of κx : A* → C to Φ
(where κ is the canonical embedding of A in its second dual). By definition of
the weak* topology, κx is continuous on A* (with the weak* topology); therefore
its restriction x̂ to Φ (with the Gelfand topology) is continuous. The function
x̂ ∈ C(Φ) is called the Gelfand transform of x. By definition

x̂(φ) = φ(x)   (φ ∈ Φ),   (3)

and therefore, by (1),

x̂(Φ) = σ(x)   (4)

and

||x̂||_{C(Φ)} = r(x).   (5)

Note that the subalgebra Â := {x̂; x ∈ A} of C(Φ) contains 1 = ê and separates
the points of Φ (if φ ≠ ψ are elements of Φ, there exists x ∈ A such that
φ(x) ≠ ψ(x), that is, x̂(φ) ≠ x̂(ψ), by (3)).

5. It is also customary to consider ℳ with the Gelfand topology of Φ
transferred to it through the bijection M ↦ φ_M. In this case the compact
Hausdorff space ℳ (with this Gelfand topology) is called the maximal ideal
space of A, and x̂ is considered as defined on ℳ through the described bijection,
that is, we write x̂(M) instead of x̂(φ_M), so that

x̂(M) = φ_M(x)   (M ∈ ℳ).   (6)

The basic neighborhoods for the Gelfand topology on ℳ are of the form

N(M₀; x₁,...,x_n; δ) = {M ∈ ℳ; |x̂_k(M) − x̂_k(M₀)| < δ, k = 1,...,n}.


6. The mapping Γ : x ↦ x̂ is clearly a representation of the algebra A into
the algebra C(Φ) (or C(ℳ)), that is, an algebra homomorphism sending e to 1:

[Γ(x + y)](φ) = φ(x + y) = φ(x) + φ(y) = [Γx + Γy](φ)

for all φ ∈ Φ, etc. It is called the Gelfand representation of A. By (5), the map
Γ is also norm-decreasing (hence continuous), and (cf. also (3))

ker Γ = {x ∈ A; r(x) = 0} = {x ∈ A; σ(x) = {0}}
      = {x ∈ A; x is quasi-nilpotent} = ⋂ℳ.   (7)


We now describe succinctly the changes that need to be made to the foregoing 
discussion to construct the Gelfand representation when the commutative 
Banach algebra A is not necessarily unital. 

A. An ideal $M$ in $A$ is called modular if the quotient algebra $A/M$ is unital,
that is, if there exists $u\in A$ such that $au-a,\ ua-a\in M$ for all $a\in A$ (such $u$
is called a modular, or relative, unit for $M$). Evidently, if $A$ itself is unital, then
every ideal is modular. Also, an ideal containing a modular ideal is also modular
(with the same modular unit). Hence, Zorn's lemma implies that every proper
modular ideal is contained in a maximal modular ideal, and every such maximal
modular ideal is a maximal ideal. One shows, mimicking the proof of Theorem
7.3, that the closure of a proper modular ideal is, again, a proper modular ideal.
In particular, a maximal modular ideal is closed. Once again, a character of
$A$ is a homomorphism of $A$ onto $\mathbb{C}$, or equivalently, a non-zero homomorphism
of $A$ into $\mathbb{C}$. The kernel of a character is a maximal modular ideal in $A$. The
map $M\mapsto\phi_M$ defined previously is a bijection between the set $\mathcal{M}$ of maximal
modular ideals and the set $\Phi$ of all characters of $A$ (note that $A/M$ is unital
because $M$ is modular, so the Gelfand-Mazur theorem indeed applies).

B. If $A$ is not unital, then every $\phi\in\Phi(A)$ has a unique extension to an
element $\phi^\#\in\Phi(A^\#)$, which is given by $\phi^\#(x+\lambda e) := \phi(x)+\lambda$ for all $x\in A$
and $\lambda\in\mathbb{C}$. We have $\Phi(A^\#)=\Phi(A)^\#\cup\{\phi_\infty\}$, where $\phi_\infty(x+\lambda e)=\lambda$ for all
$x\in A$ and $\lambda\in\mathbb{C}$. As a result, for each $x\in A$,

$$\sigma(x)=\sigma_{A^\#}(x)=\{\phi(x);\ \phi\in\Phi(A^\#)\}=\{\phi(x);\ \phi\in\Phi(A)\}\cup\{0\}.$$

In particular, $\Phi\subset S^*$. The set $\Phi\cup\{0\}$ is closed in the weak* topology of $A^*$.
Hence, giving $\Phi$ the relative weak* topology we obtain the Gelfand space, which
is a locally compact Hausdorff space.

C. The Gelfand transform $\hat x:\Phi\to\mathbb{C}$ of an element $x\in A$ is defined precisely
as for unital $A$. It is plainly continuous, and it also vanishes at infinity, so
$\hat x\in C_0(\Phi)$. As before, $\|\hat x\|_{C_0(\Phi)}=r(x)$. The Gelfand representation $\Gamma: A\to
C_0(\Phi)$ given by $x\mapsto\hat x$ is a norm-decreasing algebra homomorphism. Its image
$\hat A$ separates the points of $\Phi$, and it also vanishes identically at no point of $\Phi$ in
the sense that it is not possible for all $\hat x$, $x\in A$, to vanish at a particular $\phi\in\Phi$.
Finally, (7) is valid too.


Example. Let $X$ be a locally compact Hausdorff space and consider $A :=
C_0(X)$. For each $t\in X$, consider the "evaluation at $t$" character $\phi_t$ given by
$\phi_t(f) := f(t)$ for all $f\in C_0(X)$. One can show that $\Phi(C_0(X))=\{\phi_t;\ t\in X\}$, and
that the map $t\mapsto\phi_t$ is a homeomorphism of $X$ onto $\Phi(C_0(X))$ (see Exercise 17).
Upon identifying these two topological spaces, the Gelfand representation of
$C_0(X)$ is just the identity map!

Consequently, $\sigma(f)=f(X)$ for each $f\in C_0(X)$ when $X$ is compact; when $X$ is
not compact, point B adds the value 0, so that $\sigma(f)=f(X)\cup\{0\}$. This can also
be proved by a straightforward argument.
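The example can be made concrete in the simplest possible case. The following is a minimal sketch, assuming that $X$ is a finite discrete space $\{0,\dots,n-1\}$ (so $X$ is compact and $C_0(X)=C(X)$ is unital) and that a function is stored as its list of values; the helper names `evaluation_character`, `is_invertible` and `in_spectrum` are illustrative only.

```python
# Sketch: C(X) for a finite discrete space X = {0, ..., n-1}.
# A function f in C(X) is stored as the list [f(0), ..., f(n-1)].

def evaluation_character(t):
    """Return the character phi_t : f -> f(t)."""
    return lambda f: f[t]

def multiply(f, g):
    """Pointwise product in C(X)."""
    return [a * b for a, b in zip(f, g)]

def is_invertible(f):
    """f is invertible in C(X) iff it vanishes nowhere on X."""
    return all(value != 0 for value in f)

def in_spectrum(f, lam):
    """lam lies in sigma(f) iff f - lam*1 is not invertible."""
    return not is_invertible([value - lam for value in f])

f = [2, -1, 2 + 1j]
g = [1, 3, 0]
characters = [evaluation_character(t) for t in range(len(f))]

# Each phi_t is multiplicative on this pair of functions.
multiplicative = all(phi(multiply(f, g)) == phi(f) * phi(g) for phi in characters)

# The Gelfand transform t -> phi_t(f) reproduces f itself.
gelfand_transform = [phi(f) for phi in characters]
```

In this finite model the values of `f` are exactly the points of its spectrum, in line with $\sigma(f)=f(X)$ for compact $X$.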


Let $A$ be an arbitrary commutative Banach algebra. Since $\Gamma$ is a
homomorphism, it follows from (5) that for all $x,y\in A$,

$$r(x+y)\le r(x)+r(y);\qquad r(xy)\le r(x)r(y).$$ (8)

It follows that $A$ is a normed algebra for the so-called spectral norm $r(\cdot)$ iff
$r(x)=0$ implies $x=0$, that is (in view of (7)!), iff the so-called radical of $A$,
$\operatorname{rad}A := \ker\Gamma$, is trivial. In that case we say that $A$ is semi-simple. By (7) and
(3), equivalent characterizations of semi-simplicity are:


(i) The Gelfand representation $\Gamma$ of $A$ is injective.

(ii) $A$ contains no non-zero quasi-nilpotent elements.

(iii) $A$ is a normed algebra for the spectral norm.

(iv) The maximal modular ideals of $A$ have trivial intersection.

(v) $\Phi$ separates the points of $A$.


Example. It is clear that for a locally compact Hausdorff space X, Co(X) is 
semi-simple. 


Theorem 7.14. Let $A$ be a commutative Banach algebra. Then $A$ is semi-simple
and $\hat A$ is closed in $C_0(\Phi)$ iff there exists $K>0$ such that

$$\|x\|^2\le K\|x^2\|\qquad(x\in A).$$

In that case, the spectral norm is equivalent to the given norm on $A$ and $\Gamma$ is a
homeomorphism of $A$ onto $\hat A$. $\Gamma$ is isometric iff $K=1$ (i.e., $\|x\|^2=\|x^2\|$ for all
$x\in A$).


Proof. If $A$ is semi-simple and $\hat A$ is closed, $\Gamma$ is a one-to-one continuous linear
map of the Banach space $A$ onto the Banach space $\hat A$. By Corollary 6.11, $\Gamma$ is
a homeomorphism. The continuity of $\Gamma^{-1}$ means that there exists a constant
$K>0$ such that $\|x\|\le\sqrt{K}\,\|\hat x\|_{C_0(\Phi)}$ for all $x\in A$. Therefore

$$\|x\|^2\le K\Big(\sup_{\phi\in\Phi}|\hat x(\phi)|\Big)^2=K\sup_{\phi\in\Phi}|\widehat{x^2}(\phi)|
=K\|\widehat{x^2}\|_{C_0(\Phi)}\le K\|x^2\|.$$ (9)

Conversely, if there exists $K>0$ such that $\|x\|^2\le K\|x^2\|$ for all $x\in A$, it
follows by induction that

$$\|x\|\le K^{(1/2)+\cdots+(1/2^n)}\,\|x^{2^n}\|^{1/2^n}$$

for all $n\in\mathbb{N}$ and $x\in A$. Letting $n\to\infty$, it follows that

$$\|x\|\le K\,r(x)=K\|\hat x\|_{C_0(\Phi)}\qquad(x\in A).$$ (10)

Hence, $\ker\Gamma=\{0\}$, that is, $A$ is semi-simple, and $\Gamma$ is a homeomorphism of $A$
onto $\hat A$. Since $A$ is complete, so is $\hat A$, that is, $\hat A$ is closed in $C_0(\Phi)$.

If $K=1$ (i.e., if $\|x\|^2=\|x^2\|$ for all $x\in A$), it follows from (10) that
$\|x\|=r(x)=\|\hat x\|_{C_0(\Phi)}$ and $\Gamma$ is isometric. Conversely, if $\Gamma$ is isometric, it follows
from (9) (with $K=1$ and equality throughout) that $\|x\|^2=\|x^2\|$ for all $x$.


7.3 Involutions and C*-algebras 


Let $A$ be a semi-simple commutative Banach algebra. Since $\Phi$ is a locally
compact Hausdorff space (with the Gelfand topology), and $\hat A$ is a separating
subalgebra of $C_0(\Phi)$ that vanishes identically nowhere, it follows from
Theorem 5.39 or its generalization in Exercise 9 of Chapter 5 that $\hat A$ is dense in
$C_0(\Phi)$ if it is selfadjoint. In that case, if $J: f\mapsto\bar f$ is the conjugation (a conjugate
automorphism of $C_0(\Phi)$), define $\mathcal{J}: A\to A$ by

$$\mathcal{J} := \Gamma^{-1}J\Gamma.$$ (1)

Since $J\hat A\subset\hat A$ and $\Gamma$ maps $A$ bijectively onto $\hat A$ (when $A$ is semi-simple),
$\mathcal{J}$ is well defined. As a composition of two isomorphisms and the conjugate
isomorphism $J$, $\mathcal{J}$ is a conjugate isomorphism of $A$ onto itself such that
$\mathcal{J}^2=I$ (because $J^2=I$, where $I$ denotes the identity operator in the relevant
space). Such a map $\mathcal{J}$ is called an involution. In the non-commutative case,
multiplicativity of the involution is replaced by anti-multiplicativity:

$$\mathcal{J}(xy)=\mathcal{J}(y)\,\mathcal{J}(x).$$

So, to be precise, an involution on an algebra $A$ is a conjugate-linear
anti-multiplicative map $\mathcal{J}: A\to A$ satisfying $\mathcal{J}^2=I$. It is customary to denote
$x^* := \mathcal{J}x$ whenever $\mathcal{J}$ is an involution on $A$ (not to be confused with elements
of the conjugate space!). An algebra with an involution is then called a *-algebra.
A subalgebra of a *-algebra that is closed under the involution operation is said
to be selfadjoint or a *-subalgebra.

If $A$ and $B$ are *-algebras, a *-homomorphism (or isomorphism) $f: A\to B$
is a homomorphism (or isomorphism) such that $f(x^*)=f(x)^*$ for all $x\in A$.

An element $x$ in a *-algebra is normal if it commutes with its adjoint $x^*$.
Special normal elements are the selfadjoint ($x^*=x$) and the unitary ($x^*=x^{-1}$)
elements (the latter is relevant when $A$ is unital). The identity is necessarily
selfadjoint and unitary, because

$$e^*=ee^*=e^{**}e^*=(ee^*)^*=e^{**}=e.$$
Every element $x$ can be uniquely written as $x=a+ib$ with $a,b\in A$ selfadjoint:
we have $a=\Re x := (x+x^*)/2$, $b=\Im x := (x-x^*)/2i$, and $x^*=a-ib$. Clearly,
$x$ is normal iff $a,b$ commute.
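The decomposition $x=a+ib$ can be checked by hand in a small example. The following is a minimal sketch, assuming the *-algebra of $2\times2$ complex matrices with the conjugate transpose as involution; the helpers `adjoint`, `combine` and `close` are names chosen here for illustration.

```python
# Sketch: real and imaginary parts of a 2x2 complex matrix (a list of rows).

def adjoint(x):
    """Conjugate transpose of a 2x2 matrix."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def combine(x, y, cx, cy):
    """Entrywise linear combination cx*x + cy*y."""
    return [[cx * x[i][j] + cy * y[i][j] for j in range(2)] for i in range(2)]

def close(x, y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

x = [[1 + 2j, 3 - 1j], [4j, -2 + 0j]]
x_star = adjoint(x)

a = combine(x, x_star, 0.5, 0.5)       # real part (x + x*)/2
b = combine(x, x_star, -0.5j, 0.5j)    # imaginary part (x - x*)/(2i); 1/(2i) = -i/2

recombined = combine(a, b, 1, 1j)      # a + ib, expected to equal x
conjugate_part = combine(a, b, 1, -1j) # a - ib, expected to equal x*
```

Both `a` and `b` come out selfadjoint, and the two recombinations recover `x` and `x_star`.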

The "canonical involution" $\mathcal{J}$ defined by (1) on a semi-simple commutative
Banach algebra is uniquely determined by the natural relation $\Gamma\mathcal{J}=J\Gamma$, that
is, by the relation
$$\widehat{x^*}=\overline{\hat x}\qquad(x\in A),$$ (2)

which is equivalent to the property that $\hat a$ is real whenever $a$ is selfadjoint.
In case the Gelfand representation $\Gamma$ is isometric, Relation (2) implies the
norm-identity:

$$\|x^*x\|=\|x\|^2\qquad(x\in A).$$ (3)
Indeed
$$\|x^*x\|=\|\Gamma(x^*x)\|_{C_0(\Phi)}=\|\overline{\hat x}\,\hat x\|_{C_0(\Phi)}=\|\,|\hat x|^2\,\|_{C_0(\Phi)}
=\|\hat x\|^2_{C_0(\Phi)}=\|x\|^2.$$


A Banach algebra with an involution satisfying the norm-identity (3) (called 
the C*-identity) is called a C*-algebra. If X is any locally compact Hausdorff 
space, Co(X) is a (commutative) C*-algebra for the involution J. Theorem 7.16 
establishes that this is a universal model, up to C*-algebra isomorphism, for 
commutative C*-algebras. 

An example of a generally non-commutative C*-algebra is the Banach algebra
$B(X)$ of all bounded linear operators on a Hilbert space $X$. The involution is the
Hilbert adjoint operation $T\mapsto T^*$. Given $y\in X$, the map $x\in X\mapsto(Tx,y)$ is a
continuous linear functional on $X$. By the "Little" Riesz representation theorem
(Theorem 1.37), there exists a unique vector (depending on $T$ and $y$, hence
denoted $T^*y$) such that $(Tx,y)=(x,T^*y)$ for all $x,y\in X$. The uniqueness
implies that $T^*: X\to X$ is linear, and (with the following suprema taken over
all unit vectors $x,y$)

$$\|T^*\|=\sup_{x,y}|(x,T^*y)|=\sup_{x,y}|(Tx,y)|=\|T\|<\infty.$$

Thus, $T^*\in B(X)$, and an easy calculation shows that the map $T\mapsto T^*$ is an
(isometric) involution on $B(X)$. Moreover

$$\|T\|^2=\|T^*\|\,\|T\|\ge\|T^*T\|=\sup_{x,y}|(T^*Tx,y)|
=\sup_{x,y}|(Tx,Ty)|\ge\sup_{x}\|Tx\|^2=\|T\|^2.$$

Therefore, $\|T^*T\|=\|T\|^2$, and $B(X)$ is indeed a C*-algebra. Every closed
selfadjoint subalgebra of $B(X)$ (that is, any C*-subalgebra of $B(X)$) is likewise
an example of a generally non-commutative C*-algebra. The second Gelfand-
Naimark theorem (see Chapter 11) establishes that this example is, up to
C*-algebra isomorphism, the most general example of a C*-algebra.
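For a finite-dimensional illustration of the C*-identity in $B(X)$, the following minimal sketch assumes $X=\mathbb{C}^2$, so that operators are $2\times2$ complex matrices, and computes the operator norm as the square root of the largest eigenvalue of $T^*T$ (a standard fact for matrices, not taken from the text). The helper names are hypothetical.

```python
import math

def adjoint(t):
    """Conjugate transpose of a 2x2 matrix."""
    return [[t[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(s, t):
    """Product of two 2x2 matrices."""
    return [[sum(s[i][k] * t[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def operator_norm(t):
    """||T|| = sqrt(largest eigenvalue of T*T), for a 2x2 complex matrix."""
    h = matmul(adjoint(t), t)                   # Hermitian, positive semidefinite
    trace = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    disc = max(trace * trace - 4.0 * det, 0.0)  # guard against rounding
    largest = (trace + math.sqrt(disc)) / 2.0
    return math.sqrt(largest)

t = [[1 + 1j, 2 + 0j], [0j, 3 - 1j]]

norm_t = operator_norm(t)
norm_t_star = operator_norm(adjoint(t))
norm_t_star_t = operator_norm(matmul(adjoint(t), t))
```

Up to rounding, `norm_t_star` agrees with `norm_t` (the involution is isometric) and `norm_t_star_t` agrees with `norm_t ** 2` (the C*-identity).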

Note that in any C*-algebra, the C*-identity implies that the involution is
isometric:

Since $\|x\|^2=\|x^*x\|\le\|x^*\|\,\|x\|$, we have $\|x\|\le\|x^*\|$, hence $\|x^*\|\le
\|x^{**}\|=\|x\|$, and therefore $\|x^*\|=\|x\|$.
It follows in particular that $\|\Re x\|=\|(x+x^*)/2\|\le\|x\|$, and similarly
$\|\Im x\|\le\|x\|$. Furthermore, if the C*-algebra is unital, then $\|e\|^2=\|e^*e\|=\|e\|$,
yielding that $\|e\|=1$.




Assume that $A$ is a non-unital C*-algebra, and consider the unital algebra
$A^\#$ defined in Section 7.2. We turned it into a unital Banach algebra by giving
it the norm $\|x+\lambda e\|=\|x\|+|\lambda|$ ($x\in A$, $\lambda\in\mathbb{C}$). However, this norm does not
make $A^\#$ a C*-algebra. To this end we give $A^\#$ the different norm

$$\|x+\lambda e\| := \sup_{y\in A,\ \|y\|\le1}\|xy+\lambda y\|\qquad(x\in A,\ \lambda\in\mathbb{C}).$$

Together with the involution $(x+\lambda e)^* := x^*+\bar\lambda e$, $A^\#$ indeed becomes a unital
C*-algebra in which $A$ is (isometrically) naturally embedded; see Exercise 21.
It is called the unitization of $A$ (as a C*-algebra). Henceforth, $A^\#$ denotes this
C*-algebra. Note that the new norm on $A^\#$ does not change anything in the
construction or properties of the Gelfand space and representation in the
non-unital case (points A. to C. in the previous section). For convenience, if $A$ is a
unital C*-algebra, we set $A^\# := A$.


Lemma 7.15. Let $A$ be a C*-algebra.

(i) Every normal element $x\in A$ satisfies $\|x\|^2=\|x^2\|$ and $\|x\|=r(x)$.
(ii) If $A$ is commutative, then it is semi-simple, and its involution coincides
with the canonical involution.

Proof. (i) Applying the C*-identity successively to $x$, $x^*x$, and $x^2$ and using
the normality of $x$, we have

$$\|x\|^4=\|x^*x\|^2=\|(x^*x)^*(x^*x)\|=\|(x^2)^*x^2\|=\|x^2\|^2.$$
Thus $\|x\|^2=\|x^2\|$. By induction we get $\|x\|^{2^n}=\|x^{2^n}\|$ for all $n\in\mathbb{N}$, so from
Theorem 7.9 we infer that $r(x)=\lim_n\|x^n\|^{1/n}=\lim_n\|x^{2^n}\|^{1/2^n}=\|x\|$.

(ii) Since $A$ is commutative, each of its elements is normal. By part (i),
Theorem 7.14 implies that $\Gamma$ is isometric. In particular $A$ is semi-simple, so
that the canonical involution $\mathcal{J}$ is well-defined (and uniquely determined by the
relation $\Gamma\mathcal{J}=J\Gamma$). The conclusion of the lemma will follow if we prove that the
given involution satisfies (2), or equivalently, if we show that $\hat a$ is real whenever
$a\in A$ is selfadjoint (with respect to the given involution).

Suppose then that $a\in A$ is selfadjoint, but $\beta := \Im\hat a(\phi)\neq0$ for some $\phi\in\Phi$.
We may assume that $A$ is unital, for otherwise replace it by $A^\#$ and replace $\phi$
by $\phi^\#\in\Phi(A^\#)$. Let $\alpha := \Re\hat a(\phi)$ and $b := (1/\beta)(a-\alpha e)$. Then $b$ is selfadjoint,
and $\hat b(\phi)=i$. For any real $\lambda$, since $\phi$ is contractive,

$$(1+\lambda)^2=|(1+\lambda)i|^2=|\phi(b+i\lambda e)|^2
\le\|b+i\lambda e\|^2=\|(b+i\lambda e)^*(b+i\lambda e)\|
=\|(b-i\lambda e)(b+i\lambda e)\|=\|b^2+\lambda^2e\|\le\|b^2\|+\lambda^2.$$

Therefore, $2\lambda<\|b^2\|$, which is absurd since $\lambda$ is arbitrary.


The last lemma has the following neat consequence: a *-algebra admits at
most one norm that makes it into a C*-algebra. Indeed, for every element $x$, since
$x^*x$ is selfadjoint and hence normal, the C*-identity implies $\|x\|^2=\|x^*x\|=
r(x^*x)$, and the spectral radius depends only on the algebraic structure of $A$.

Putting together all the ingredients accumulated, we obtain the following 
important result. 


Theorem 7.16 (The commutative Gelfand-Naimark theorem). Let $A$
be a commutative C*-algebra. Then the Gelfand representation $\Gamma$ is an isometric
*-isomorphism of $A$ onto $C_0(\Phi)$.


Proof. It was observed in the proof of Lemma 7.15 that $\Gamma$ is a *-preserving
isometry of $A$ onto $\hat A$. It follows in particular that $\hat A$ is a closed selfadjoint
subalgebra of $C_0(\Phi)$. As observed in the beginning of this section, commutativity
and semi-simplicity of $A$ and selfadjointness of $\hat A$ imply that $\hat A$ is dense in $C_0(\Phi)$
by virtue of the Stone-Weierstrass theorem. Thus, these two algebras coincide.


Let $A$ be any (not necessarily commutative!) C*-algebra, and let
$x\in A$ be selfadjoint. Denote by $[x]$ the closure in $A^\#$ of the set
$\{p(x);\ p\text{ complex polynomial}\}$. Then $[x]$ is clearly the unital C*-subalgebra of
$A^\#$ generated by $x$ and $e_{A^\#}$, and it is commutative. Let $\Gamma: y\mapsto\hat y$ be the
Gelfand representation of $[x]$. Since it is a *-isomorphism and $x$ is selfadjoint,
$\hat x$ is real. As we know from Section 7.2, the spectrum $\sigma_{[x]}(x)$ is either equal to
the image of $\hat x$ or to this image with 0 added to it. Either way, we obtain that
$\sigma_{[x]}(x)\subset\mathbb{R}$. Since $\sigma(x)\subset\sigma_{[x]}(x)$ by Theorem 7.12, we get $\sigma(x)\subset\mathbb{R}$. This proves
the following:


Theorem 7.17. The spectrum of a selfadjoint element of a C*-algebra is real. 


In fact, the spectrum of an arbitrary element of a C*-algebra does not change 
when passing to a subalgebra (provided there is no change in the unit). Precisely: 


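Theorem 7.18. Let $\mathcal{B}$ be a C*-subalgebra of the unital C*-algebra $\mathcal{A}$ with $e_{\mathcal{A}}\in
\mathcal{B}$. Then $G(\mathcal{B})=G(\mathcal{A})\cap\mathcal{B}$ and $\sigma_{\mathcal{B}}(x)=\sigma(x)$ for all $x\in\mathcal{B}$.


Proof. If $b\in\mathcal{B}$ is selfadjoint then $\sigma_{\mathcal{B}}(b)=\sigma(b)$ by Corollary 7.13: either apply
Theorem 7.17 to obtain that $\sigma_{\mathcal{B}}(b)$ is real, thus nowhere dense in $\mathbb{C}$, or to obtain
that $\sigma(b)$ is real, thus $\rho(b)$ is connected.

Since $G(\mathcal{B})\subset G(\mathcal{A})\cap\mathcal{B}$ trivially, we must show that if $x\in\mathcal{B}$ has an inverse
$x^{-1}\in\mathcal{A}$, then $x^{-1}\in\mathcal{B}$. The element $x^*x\in\mathcal{B}$ is selfadjoint, and has clearly the
inverse $x^{-1}(x^{-1})^*$ in $\mathcal{A}$:

$$[x^{-1}(x^{-1})^*][x^*x]=x^{-1}(xx^{-1})^*x=x^{-1}e^*x=x^{-1}x=e,$$
and similarly for multiplication in reversed order. Thus $0\notin\sigma(x^*x)=\sigma_{\mathcal{B}}(x^*x)$ by
the foregoing. Hence, $x^*x\in G(\mathcal{B})$, and therefore the inverse $x^{-1}(x^{-1})^*$ belongs
to $G(\mathcal{B})$. Consequently $x^{-1}=[x^{-1}(x^{-1})^*]x^*\in\mathcal{B}$, as wanted.
It now follows that $\rho_{\mathcal{B}}(x)=\rho(x)$, hence $\sigma_{\mathcal{B}}(x)=\sigma(x)$, for all $x\in\mathcal{B}$.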


Finally, let us derive a few obvious consequences of the previous results. By
Lemma 7.15, a normal quasi-nilpotent element of a C*-algebra is necessarily
zero, and the spectrum of a normal element $x$ contains a complex number with
modulus equal to $\|x\|$. In particular, if $x$ is selfadjoint, $\sigma(x)$ is contained in the
closed interval $[-\|x\|,\|x\|]$ (cf. (8) following Definition 7.5 and Theorem 7.17),
so either $\|x\|$ or $-\|x\|$ (or both) belong to $\sigma(x)$.
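Normality cannot be dropped from these statements. The following minimal sketch, assuming the C*-algebra of $2\times2$ complex matrices with the operator norm (computed as the square root of the largest eigenvalue of $T^*T$), contrasts a non-normal nilpotent matrix with a diagonal, hence normal, one. All helper names are illustrative.

```python
import math

def adjoint(t):
    """Conjugate transpose of a 2x2 matrix."""
    return [[t[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(s, t):
    """Product of two 2x2 matrices."""
    return [[sum(s[i][k] * t[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def operator_norm(t):
    """||T|| = sqrt(largest eigenvalue of T*T)."""
    h = matmul(adjoint(t), t)
    trace = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    disc = max(trace * trace - 4.0 * det, 0.0)
    return math.sqrt((trace + math.sqrt(disc)) / 2.0)

def is_normal(x, tol=1e-12):
    """Check x*x == xx* entrywise, up to a small tolerance."""
    left, right = matmul(adjoint(x), x), matmul(x, adjoint(x))
    return all(abs(left[i][j] - right[i][j]) < tol
               for i in range(2) for j in range(2))

n = [[0j, 1 + 0j], [0j, 0j]]    # n*n = 0, so n is quasi-nilpotent: r(n) = 0
d = [[3 + 0j, 0j], [0j, -1j]]   # diagonal, sigma(d) = {3, -i}, so r(d) = 3

n_squared = matmul(n, n)
norm_n = operator_norm(n)
norm_d = operator_norm(d)
```

Here `n` is non-zero and quasi-nilpotent with norm 1, which is possible only because it is not normal, while for the normal matrix `d` the norm equals the spectral radius 3.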


7.4 Normal elements 


Terminology 7.19. If $x$ is a normal element of the arbitrary C*-algebra
$A$, we still denote by $[x]$ the unital C*-subalgebra of $A^\#$ generated by $x$
and $e_{A^\#}$, that is, the closure in $A^\#$ of all complex polynomials in $x$ and
$x^*$, $\sum\alpha_{kj}x^k(x^*)^j$ (finite sums, with $\alpha_{kj}\in\mathbb{C}$). Since $x$ is normal, it is clear
that $[x]$ is a commutative C*-algebra. By the commutative Gelfand-Naimark
theorem, the Gelfand representation $\Gamma$ of $[x]$ is an isometric *-isomorphism of
$[x]$ onto $C(\Phi)$, where $\Phi$ denotes the space of all characters (= non-zero complex
homomorphisms) of $[x]$ with the Gelfand topology.

If $\phi,\psi\in\Phi$ are such that $\hat x(\phi)=\hat x(\psi)$, then $\widehat{x^*}(\phi)=\overline{\hat x(\phi)}=\overline{\hat x(\psi)}=\widehat{x^*}(\psi)$,
and therefore $\hat y(\phi)=\hat y(\psi)$, that is, $\phi(y)=\psi(y)$, for all $y\in[x]$, namely $\phi=\psi$.
It follows that $\hat x:\Phi\to\sigma(x)$ (cf. (4) in Section 7.2) is a continuous bijective
map. Since both $\Phi$ and $\sigma(x)$ are compact Hausdorff spaces, the map $\hat x$ is a
homeomorphism. It induces the isometric *-isomorphism

$$\Xi: f\in C(\sigma(x))\mapsto f\circ\hat x\in C(\Phi).$$

The composition $\tau := \Gamma^{-1}\circ\Xi$ is a unital isometric *-isomorphism of $C(\sigma(x))$ onto
$[x]$, which carries the identity function $f_1(\lambda)=\lambda$ onto $\Gamma^{-1}(\hat x)=x$. This isometric
*-isomorphism is called the $C(\sigma(x))$- (or continuous) operational calculus for the
normal element $x$ of $A$. It is customary to write $f(x)$ instead of $\tau(f)$. Note that
$f(x)$ is a normal element of $A^\#$, for each $f\in C(\sigma(x))$. Its adjoint is $\bar f(x)$. The
isometricity of $\tau$ means that

$$\|f(x)\|=\|f\|_{C(\sigma(x))}\qquad(f\in C(\sigma(x))),$$

and the fact that it is onto means that $[x]=\{f(x);\ f\in C(\sigma(x))\}$.

The $C(\sigma(x))$-operational calculus $\tau$ sends any polynomial $p(\lambda)
=\sum\alpha_{kj}\lambda^k(\bar\lambda)^j$ to $p(x)=\sum\alpha_{kj}x^k(x^*)^j\in A^\#$ (the constant function 1 is
mapped to $e_{A^\#}$). Denote by $\mathcal{P}$ the algebra of these polynomials, that is, in
a complex variable and its conjugate. It is dense in $C(\sigma(x))$ by the Stone-
Weierstrass theorem (cf. Theorem 5.39). As a result, the $C(\sigma(x))$-operational
calculus for $x$ is uniquely determined by the following weaker property: $\tau:
C(\sigma(x))\to A^\#$ is a continuous unital *-homomorphism that sends $f_1$ to $x$.
Indeed, such $\tau$ sends any $p\in\mathcal{P}$ to $p(x)$, and by continuity, $\tau$ is then uniquely
determined on $C(\sigma(x))$.

For every $f\in C(\sigma(x))$, it follows from the foregoing that $f(x)$ is the limit in
$A^\#$ of a sequence of polynomials in $x$ and $x^*$. If $A$ is not unital and $f(0)=0$,
then one can approximate $f$ by elements of $\mathcal{P}$ that vanish at 0, that is, without
a free term (why?), and since $p(x)\in A$ for such $p$, we have $f(x)\in A$.




Remark that in order to prove the existence of the continuous operational
calculus, it suffices to extend the spectral mapping theorem to elements of $\mathcal{P}$, that
is, to show that for $p\in\mathcal{P}$ we have $\sigma(p(x))=p(\sigma(x))$ (recall that $x$ is normal!).
Indeed, by Lemma 7.15 this implies that $\|p(x)\|=r(p(x))=\|p\|_{C(\sigma(x))}$. As
a result, the map that sends $p\in\mathcal{P}$ to $p(x)\in A^\#$ is an isometric unital
*-homomorphism. The density of $\mathcal{P}$ in $C(\sigma(x))$ implies that this map extends to
all of $C(\sigma(x))$, and the extension has the desired properties.

The continuous operational calculus has numerous applications. For instance,
suppose that $A$ is unital and $u\in A$ is unitary. Then $u$ is normal, and by
the properties of the continuous operational calculus, $uu^*$ being equal to $e$ is
equivalent to $f_1\bar f_1=|f_1|^2$ being equal to 1 identically on $\sigma(u)$, that is, to $\sigma(u)$
being contained in the complex unit circle $\{z\in\mathbb{C};\ |z|=1\}$.
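The unitary case can be tested numerically in the smallest non-trivial setting. The following is a minimal sketch, assuming the C*-algebra of $2\times2$ complex matrices and a plane rotation as the unitary element; the eigenvalues are obtained from the characteristic polynomial, and the helper names and the angle `theta` are choices made here for illustration.

```python
import cmath
import math

def adjoint(x):
    """Conjugate transpose of a 2x2 matrix."""
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(s, t):
    """Product of two 2x2 matrices."""
    return [[sum(s[i][k] * t[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues(x):
    """Roots of the characteristic polynomial of a 2x2 complex matrix."""
    trace = x[0][0] + x[1][1]
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return [(trace + root) / 2, (trace - root) / 2]

theta = 0.7
u = [[complex(math.cos(theta)), complex(-math.sin(theta))],
     [complex(math.sin(theta)), complex(math.cos(theta))]]

u_star_u = matmul(adjoint(u), u)            # expected: the identity matrix
moduli = [abs(z) for z in eigenvalues(u)]   # expected: both equal to 1
```

Up to rounding, `u_star_u` is the identity and both eigenvalues of `u` have modulus 1, as the operational calculus predicts.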


Theorem 7.20 (The spectral mapping and composition theorems).
Let $x$ be a normal element of the C*-algebra $A$, and let $f\mapsto f(x)$ be its
$C(\sigma(x))$-operational calculus. Then for all $f\in C(\sigma(x))$, $\sigma(f(x))=f(\sigma(x))$,
and furthermore, for all $g\in C(\sigma(f(x)))$ (so that necessarily $g\circ f\in C(\sigma(x))$),
the identity $g(f(x))=(g\circ f)(x)$ is valid.

If $x,y\in A$ are normal, and $f\in C(\sigma(x)\cup\sigma(y))$ is injective, then $f(x)=f(y)$
implies $x=y$.


Proof. Let $\mu\in\mathbb{C}$. Since $\tau$ is a unital isomorphism of $C(\sigma(x))$ and $[x]$, it follows
that $\mu e_{A^\#}-f(x)=\tau(\mu-f)$ is singular in $[x]$ iff $\mu-f$ is singular in $C(\sigma(x))$,
that is, iff there exists $\lambda\in\sigma(x)$ such that $\mu=f(\lambda)$. Since $\sigma(f(x))=\sigma_{A^\#}(f(x))$
equals $\sigma_{[x]}(f(x))$ by Theorem 7.18, we conclude that $\mu\in\sigma(f(x))$ iff $\mu\in f(\sigma(x))$.

The maps $g\mapsto g\circ f$ and $h\mapsto h(x)$ are unital isometric *-homomorphisms
of $C(\sigma(f(x)))=C(f(\sigma(x)))$ into $C(\sigma(x))$ and of $C(\sigma(x))$ into $A^\#$, respectively.
Their composition $g\mapsto(g\circ f)(x)$ is a unital isometric *-homomorphism of
$C(\sigma(f(x)))$ into $A^\#$ that carries $f_1$ to $f(x)$. By the uniqueness of the continuous
operational calculus for the normal element $f(x)$, we have $(g\circ f)(x)=g(f(x))$
for all $g\in C(\sigma(f(x)))$.

The last statement of the theorem follows by taking the continuous function
$g := f^{-1}$ (whose domain is the image of $f$) in the last formula applied to both $x$
and $y$:

$$x=f_1(x)=(g\circ f)(x)=g(f(x))=g(f(y))=(g\circ f)(y)=f_1(y)=y.$$
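The identity $\sigma(f(x))=f(\sigma(x))$ can be observed directly in a small case. The following minimal sketch assumes the C*-algebra of $2\times2$ complex matrices, a Hermitian (hence normal) matrix `x`, and the polynomial $p(z)=z^2+2z+3$ in place of a general continuous $f$; for a polynomial in $x$ alone this is already the elementary spectral mapping theorem, so the code only illustrates the statement. Helper names are hypothetical.

```python
import cmath

def matmul(s, t):
    """Product of two 2x2 matrices."""
    return [[sum(s[i][k] * t[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def eigenvalues(x):
    """Roots of the characteristic polynomial of a 2x2 complex matrix."""
    trace = x[0][0] + x[1][1]
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return [(trace + root) / 2, (trace - root) / 2]

def p(z):
    """The scalar polynomial p(z) = z^2 + 2z + 3."""
    return z * z + 2 * z + 3

def p_of_matrix(x):
    """The matrix p(x) = x^2 + 2x + 3e."""
    x2 = matmul(x, x)
    identity = [[1, 0], [0, 1]]
    return [[x2[i][j] + 2 * x[i][j] + 3 * identity[i][j] for j in range(2)]
            for i in range(2)]

x = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]   # Hermitian, hence normal

spectrum_x = eigenvalues(x)                  # sigma(x)
spectrum_px = eigenvalues(p_of_matrix(x))    # sigma(p(x))
image = [p(z) for z in spectrum_x]           # p(sigma(x))
```

The two lists `spectrum_px` and `image` contain the same numbers (possibly in a different order, and up to rounding).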


7.5 The Arens products 


Let $A$ be a Banach algebra. This section shows how to construct two products
on the second dual $A^{**}$ of $A$, each of which turns $A^{**}$ into a Banach algebra.

It will be convenient to adopt the following convention. If $V$ is a vector space
(such as $A$ or $A^*$), $x\in V$ and $f$ is a linear functional on $V$, we write $(x,f)$ or
$(f,x)$ for $f(x)$.


The two Arens products on A** are defined in three steps: 




(A) Let $\omega\in A^*$ and $a\in A$. Define elements $a\omega,\omega a\in A^*$ by
$$(\omega a,b) := (\omega,ab)\quad\text{and}\quad(b,a\omega) := (ba,\omega)$$

("moving $a$ to the other side"), that is, $(\omega a)(b) := \omega(ab)$ and $(a\omega)(b) :=
\omega(ba)$, for each $b\in A$. It is clear that $a\omega,\omega a$ are indeed bounded linear
functionals on $A$, namely elements of $A^*$, of norm at most $\|a\|\,\|\omega\|$.


(B) For $\omega\in A^*$ and $X\in A^{**}$, define elements $X\omega,\omega X\in A^*$ by
$$(a,\omega X) := (a\omega,X)\quad\text{and}\quad(X\omega,a) := (X,\omega a)$$

("moving $\omega$ to the other side"), that is, $(\omega X)(a) := X(a\omega)$ and $(X\omega)(a) :=
X(\omega a)$, for each $a\in A$. It is clear that $\omega X,X\omega$ are indeed bounded linear
functionals on $A$, namely, elements of $A^*$, of norm at most $\|X\|\,\|\omega\|$.


(C) Finally, for $X,Y\in A^{**}$, define elements $X\,\square\,Y,\ X\diamond Y\in A^{**}$ by

$$(X\,\square\,Y,\omega) := (X,Y\omega)\quad\text{and}\quad(\omega,X\diamond Y) := (\omega X,Y),$$

that is, $(X\,\square\,Y)(\omega) := X(Y\omega)$ and $(X\diamond Y)(\omega) := Y(\omega X)$, for each $\omega\in A^*$.
It is clear that $X\,\square\,Y,\ X\diamond Y$ are indeed bounded linear functionals on $A^*$,
namely, elements of $A^{**}$, of norm at most $\|X\|\,\|Y\|$.


The notation we used in (A) and (B) is suggestive, because:


(i) The left and right operations defined in (A) turn $A^*$ into an $A$-bimodule.


(ii) The operations of $A^{**}$ on $A^*$ defined in (B) "extend" the operations of $A$
on $A^*$ defined in (A): for $\omega\in A^*$ and $a\in A$ we have $\omega\hat a=\omega a$ and $\hat a\omega=a\omega$,
where $\kappa$ is the canonical embedding of $A$ in $A^{**}$ and $\hat a=\kappa(a)$.


Each of the binary operations $\square$ and $\diamond$ turns $A^{**}$ into an algebra. They are called
the first and the second Arens products on $A^{**}$, respectively. As the plural form
suggests, these products do not always agree. When they do agree, we say that
$A$ is Arens regular. See the exercises for examples and non-examples of Arens
regularity. We prove in an exercise in Chapter 12 that all C*-algebras are Arens
regular. However, generally speaking, Arens regularity is a rare property.

Each of the Arens products is submultiplicative, and thus turns $A^{**}$ into a
Banach algebra. Furthermore:


(iii) Each of the Arens products "extends" the product in $A$: $\hat a\,\square\,\hat b=\widehat{ab}=\hat a\diamond\hat b$
for all $a,b\in A$.


(iv) The Arens products agree when one of the multiplicands comes from $A$:
$X\,\square\,\hat a=X\diamond\hat a$ and $\hat a\,\square\,X=\hat a\diamond X$ for all $X\in A^{**}$ and $a\in A$.


(v) The operations of $A^{**}$ on $A^*$ defined in (B) turn $A^*$ into a left $(A^{**},\square)$-
module and a right $(A^{**},\diamond)$-module.


From (iii) we see that each of the Banach algebras $(A^{**},\square)$ and $(A^{**},\diamond)$ contains
$A$ as a Banach subalgebra.




Separate continuity in the weak*-topology on $A^{**}$ (that is, the $A^*$-topology)
is subtle:
(vi) The following maps on $A^{**}$ are continuous with respect to the weak*-
topology:
• for each fixed $Y\in A^{**}$, the map $X\mapsto X\,\square\,Y$;
• for each fixed $X\in A^{**}$, the map $Y\mapsto X\diamond Y$.


Facts (i)-(vi) follow easily from the definitions.

Let $X,Y\in A^{**}$, and let $\{a_i\},\{b_j\}$ be nets in $A$ such that $\{\hat a_i\},\{\hat b_j\}$ converge
to $X,Y$, respectively, in the weak*-topology on $A^{**}$. By (iii), (iv), and (vi) we
have

$$X\,\square\,Y=\lim_i\hat a_i\,\square\,Y=\lim_i\hat a_i\diamond Y=\lim_i\Big(\lim_j\hat a_i\diamond\hat b_j\Big)=\lim_i\Big(\lim_j\widehat{a_ib_j}\Big)$$

and

$$X\diamond Y=\lim_jX\diamond\hat b_j=\lim_jX\,\square\,\hat b_j=\lim_j\Big(\lim_i\hat a_i\,\square\,\hat b_j\Big)=\lim_j\Big(\lim_i\widehat{a_ib_j}\Big),$$

where all limits are in the weak*-topology. These formulae express the difference
between the two Arens products lucidly.

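We proceed to establish a criterion for Arens regularity. As preparation, fix
$\omega\in A^*$ and consider the module operation maps $\lambda,\rho: A\to A^*$ defined by

$$\lambda(a) := a\omega,\qquad\rho(a) := \omega a\qquad(a\in A).$$
These are bounded linear maps whose Banach adjoints $\lambda^*,\rho^*: A^{**}\to A^*$ satisfy
$$\lambda^*(X)=\omega X,\qquad\rho^*(X)=X\omega\qquad(\forall X\in A^{**}).$$
Indeed, $\lambda^*(X)=X\circ\lambda$ maps $a\in A$ to $(a\omega,X)=(a,\omega X)$ and thus equals $\omega X$,
and similarly for $\rho^*$. Taking the adjoint again, we get the maps $\lambda^{**},\rho^{**}: A^{**}\to
A^{***}$, which satisfy

$$(\lambda^{**}(Y))(X)=(X\diamond Y,\omega),\qquad(\rho^{**}(X))(Y)=(X\,\square\,Y,\omega)\qquad(\forall X,Y\in A^{**}).$$

Indeed, $\lambda^{**}(Y)=Y\circ\lambda^*$ maps $X$ to $(\omega X,Y)=(\omega,X\diamond Y)$, and similarly for
$\rho^{**}$.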

Theorem 7.21. Let $A$ be a Banach algebra and $\omega\in A^*$. The following
conditions are equivalent:

1. For each $X,Y\in A^{**}$ we have $(X\,\square\,Y,\omega)=(X\diamond Y,\omega)$.
2. For each fixed $X\in A^{**}$, the linear functional $Y\mapsto(X\,\square\,Y,\omega)$ on $A^{**}$ is
weak*-continuous (that is, continuous in the $A^*$-topology on $A^{**}$).
3. For each fixed $Y\in A^{**}$, the linear functional $X\mapsto(X\diamond Y,\omega)$ on $A^{**}$ is
weak*-continuous.
4. The set $\{\omega a;\ a\in A,\ \|a\|\le1\}$ has compact closure in the weak topology of
$A^*$ (that is, the $A^{**}$-topology).
5. The set $\{a\omega;\ a\in A,\ \|a\|\le1\}$ has compact closure in the weak topology of
$A^*$.

Proof. The equivalences 1 $\Leftrightarrow$ 2 and 1 $\Leftrightarrow$ 3 follow from (iv) and (vi).
Indeed, 1 implies both 2 and 3 by (vi). Conversely, if, for instance, 2 holds and
$X,Y\in A^{**}$, then taking nets $\{a_i\},\{b_j\}$ in $A$ such that $\{\hat a_i\},\{\hat b_j\}$ converge to
$X,Y$, respectively in the weak*-topology, we get

$$(X\,\square\,Y,\omega)=\lim_j(X\,\square\,\hat b_j,\omega)=\lim_j(X\diamond\hat b_j,\omega)=(X\diamond Y,\omega)$$
by (iv) and (vi), proving 1.

In the rest of the proof we use the maps $\lambda,\rho: A\to A^*$ associated with
$\omega$ as in the paragraph preceding the theorem. We require the notion of weak
compactness of operators (see Exercise 26 of Chapter 6). Notice that 4 and 5
mean weak compactness of $\rho$ and $\lambda$, respectively.

Condition 2 precisely means that for all $X\in A^{**}$, the functional $\rho^{**}(X)\in
A^{***}$ is continuous in the $A^*$-topology on $A^{**}$, which, by Theorem 5.23, means
that $\rho^{**}(X)$ belongs to the canonical image of $A^*$ in $A^{***}$. By Part (a) of
Exercise 26 of Chapter 6, this is equivalent to weak compactness of $\rho$, namely,
to 4. Similarly, 3 is equivalent to 5.


A linear functional $\omega\in A^*$ satisfying the equivalent conditions of
Theorem 7.21 is called weakly almost periodic. Plainly, $A$ is Arens regular iff
all elements of $A^*$ are weakly almost periodic.


Exercises 


1. Let $A$ be a non-unital Banach algebra. Consider then the Cartesian
product Banach space $A^\# := A\times\mathbb{C}$ with the norm $\|[x,\lambda]\|=\|x\|+|\lambda|$
and the multiplication

$$[x,\lambda]\,[y,\mu]=[xy+\lambda y+\mu x,\ \lambda\mu].$$

Prove that $A^\#$ is a unital Banach algebra with the identity $e := [0,1]$,
commutative if $A$ is commutative, and the map $x\in A\mapsto[x,0]\in A^\#$ is
an isometric isomorphism of $A$ onto a maximal ideal (identified with $A$)
in $A^\#$. (With this identification, we have $A^\#=A+\mathbb{C}e$.)

If $\phi$ is a homomorphism of the commutative Banach algebra $A$ into $\mathbb{C}$,
it extends uniquely to a homomorphism (also denoted by $\phi$) of $A^\#$ into $\mathbb{C}$
by the identity $\phi([x,\lambda])=\phi(x)+\lambda$. Conclude that $\|\phi\|=1$.


2. The requirement $\|xy\|\le\|x\|\,\|y\|$ in the definition of a Banach algebra
implies the joint continuity of multiplication. Prove:

(a) If $A$ is a Banach space and also an algebra for which multiplication is
separately continuous, then multiplication is jointly continuous. (Hint:
consider the bounded operators $L_x: y\mapsto xy$ and $R_y: x\mapsto xy$ on $A$
and use the uniform boundedness theorem.)

(b) We use the notation as in Part (a). The norm $|x| := \|L_x\|$ (where the
norm on the right is the $B(A)$-norm) is equivalent to the given norm
on $A$, and satisfies the submultiplicativity requirement $|xy|\le|x|\,|y|$.
If $A$ has an algebraic identity $e$, then $|e|=1$.


3. Let $A$ be a unital complex Banach algebra. If $F\subset A$ consists of commuting
elements, denote by $C_F$ the maximal commutative Banach subalgebra of
$A$ containing $F$. Prove:

(a) $\sigma_{C_F}(a)=\sigma(a)$ for all $a\in C_F$.
(b) If $a,b\in A$ commute, then

$$\sigma(a+b)\subset\sigma(a)+\sigma(b)\quad\text{and}\quad\sigma(ab)\subset\sigma(a)\sigma(b).$$
Conclude that
$$r(a+b)\le r(a)+r(b)\quad\text{and}\quad r(ab)\le r(a)r(b).$$
(c) For all $a\in A$ and $\lambda\in\rho(a)$,

$$r(R(\lambda;a))=\frac{1}{d(\lambda,\sigma(a))}.$$

(Hint: use the Gelfand representation of $C_{\{a,b\}}$.)


4. Let $A$ be a commutative unital Banach algebra (over $\mathbb{C}$). A set $E\subset A$
generates $A$ if the minimal closed subalgebra of $A$ containing $E$ and the
identity $e$ coincides with $A$. In that case, prove that the maximal ideal
space of $A$ is homeomorphic to a closed subset of the Cartesian product
$\prod_{a\in E}\sigma(a)$.


5. Let $A$ be a unital Banach algebra, and let $G$ be the group of regular
elements of $A$. Suppose $\{a_n\}\subset G$ has the following properties:

(i) $a_n\to a$ and $a_na=aa_n$ for all $n$;

(ii) the sequence $\{r(a_n^{-1})\}$ is bounded.

Prove that $a\in G$. (Hint: observe that $r(e-a_n^{-1}a)\le r(a_n^{-1})\,r(a_n-a)\to0$,
hence, $1-\sigma(a_n^{-1}a)=\sigma(e-a_n^{-1}a)\subset B(0,1/2)$ for $n$ large enough, and
therefore $0\notin\sigma(a_n^{-1}a)$.)


6. Let $A$ be a unital Banach algebra, $a\in A$, and $\lambda\in\rho(a)$. Prove that
$\lambda\in\rho(b)$ for all $b\in A$ for which the series $s(\lambda) := \sum_{n=0}^{\infty}[(b-a)R(\lambda;a)]^n$
converges in $A$, and for such $b$, $R(\lambda;b)=R(\lambda;a)s(\lambda)$.


7. Let $A$ and $a$ be as in Exercise 6, and let $V$ be an open set in $\mathbb{C}$ such that
$\sigma(a)\subset V$. Prove that there exists $\delta>0$ such that $\sigma(b)\subset V$ for all $b$ in
the ball $B(a,\delta)$. (Hint: if $M$ is a bound for $R(\cdot;a)$ on the complement of
$V$ in the Riemann sphere, take $\delta=1/M$ and apply Exercise 6.)


8. Let $\phi$ be a non-zero linear functional on the unital Banach algebra $A$.
Trivially, if $\phi$ is multiplicative, then $\phi(e)=1$ and $\phi\neq0$ on $G(A)$. The
following steps provide a proof of the converse. Suppose $\phi(e)=1$ and
$\phi\neq0$ on $G(A)$. Denote $N=\ker\phi$ (note that $N\cap G(A)=\emptyset$). Prove:

(a) $d(e,N)=1$. (Hint: if $\|e-x\|<1$, then $x\in G(A)$, hence $x\notin N$.)

(b) $\phi\in A^*$ and has norm 1. (Hint: if $a\notin N$, $a_1 := e-\phi(a)^{-1}a\in N$, hence
$d(e,a_1)\ge1$ by Part (a).)

(c) Fix $a\in N$ with norm 1, and let $f(\lambda) := \phi(\exp(\lambda a))$ (where the
exponential is defined by means of the usual power series, converging
absolutely in $A$ for all $\lambda\in\mathbb{C}$). Then $f$ is an entire function with no
zeros such that $f(0)=1$, $f'(0)=0$, and $|f(\lambda)|\le e^{|\lambda|}$.

(d) (This is a result about entire functions.) If $f$ has the properties listed
in Part (c), then $f=1$ identically. Sketch of proof: since $f$ has no
zeros, it can be represented as $f=e^g$ with $g$ entire; necessarily $g(0)=
g'(0)=0$, so that $g(\lambda)=\lambda^2h(\lambda)$ with $h$ entire, and $\Re g(\lambda)\le|\lambda|$. For
any $r>0$, verify that $|2r-g|\ge|g|$ in the disc $|\lambda|\le r$ and $|2r-g|>0$
in the disc $|\lambda|<2r$. Therefore, $F(\lambda) := [r^2h(\lambda)]/[2r-g(\lambda)]$ is analytic
in $|\lambda|<2r$, and $|F|\le1$ on the circle $|\lambda|=r$, hence in the disc $|\lambda|\le r$
by the maximum modulus principle. Thus

$$|h(\lambda)|\le\frac{|2r-g(\lambda)|}{r^2}\qquad(|\lambda|\le r).$$

Given $\lambda$, let $r\to\infty$ to conclude that $h=0$.

(e) If $a\in N$, then $a^2\in N$. (Hint: apply Parts (c) and (d) and look at the
coefficient of $\lambda^2$ in the series for $f$.)

(f) $\phi(x^2)=\phi(x)^2$ for all $x\in A$. (Represent $x=x_1+\phi(x)e$ with $x_1\in N$
and apply Part (e).) In particular, $x\in N$ iff $x^2\in N$.

(g) If either $x$ or $y$ belong to $N$, then (i) $xy+yx\in N$; (ii) $(xy)^2+(yx)^2\in
N$; and (iii) $xy-yx\in N$. (For (i), apply Part (f) to $x+y$; for (ii),
apply (i) to $yxy$ instead of $y$, when $x\in N$; for (iii), write $(xy-yx)^2=
2[(xy)^2+(yx)^2]-(xy+yx)^2$ and use Part (f).) Conclude that $N$ is
a two-sided ideal in $A$ and $\phi$ is multiplicative (use the representation
$x=x_1+\phi(x)e$ with $x_1\in N$).


9. Let $A$ be a unital Banach algebra. For $a,b\in A$, denote $C(a,b)=L_a-R_b$
(cf. Section 7.1), and consider the series

$$b_L(\lambda)=\sum_{j=0}^{\infty}(-1)^jR(\lambda;a)^{j+1}[C(a,b)^je];$$

$$b_R(\lambda)=\sum_{j=0}^{\infty}[C(b,a)^je]R(\lambda;a)^{j+1}$$

for $\lambda\in\rho(a)$. Prove that if $b_L(\lambda)$ ($b_R(\lambda)$) converges in $A$ for some $\lambda\in\rho(a)$,
then its sum is a left inverse (right inverse, respectively) for $\lambda e-b$. In
particular, if $\lambda\in\rho(a)$ is such that both series converge in $A$, then $\lambda\in\rho(b)$
and $R(\lambda;b)=b_L(\lambda)=b_R(\lambda)$.


10. Use notation as in Exercise 9. Set

$$r(a,b)=\limsup\|C(a,b)^ne\|^{1/n},$$

and consider the compact subsets of $\mathbb{C}$
$$\sigma_L(a,b)=\{\lambda\in\mathbb{C};\ d(\lambda,\sigma(a))\le r(a,b)\};$$
$$\sigma_R(a,b)=\{\lambda\in\mathbb{C};\ d(\lambda,\sigma(a))\le r(b,a)\};$$
$$\sigma(a,b)=\sigma_L(a,b)\cup\sigma_R(a,b).$$

Prove that the series $b_L(\lambda)$ ($b_R(\lambda)$) converge absolutely and uniformly
on compact subsets of $\sigma_L(a,b)^c$ ($\sigma_R(a,b)^c$, respectively). In particular,
$\sigma(b)\subset\sigma(a,b)$, and $R(\cdot;b)=b_L=b_R$ on $\sigma(a,b)^c$.


11. Use notation as in Exercise 10. Set
$$d(a,b)=\max\{r(a,b),r(b,a)\},$$
so that trivially
$$\sigma(a,b)=\{\lambda;\ d(\lambda,\sigma(a))\le d(a,b)\}$$

and $\sigma(a,b)=\sigma(a)$ iff $d(a,b)=0$. In this case, it follows from Exercise 10
(and symmetry) that $\sigma(b)=\sigma(a)$ (for this reason, elements $a,b$ such that
$d(a,b)=0$ are said to be spectrally equivalent).


12. Let $D$ be a derivation on a unital Banach algebra $A$, that is, a linear map
$D: A\to A$ such that $D(ab)=(Da)b+a(Db)$ for all $a,b\in A$. (Example:
given $s\in A$, the map $D_s := L_s-R_s$ is a derivation; it is called an inner
derivation.) Prove:


(a) If $D$ is a derivation on $A$ and $Dv$ commutes with $v$ for some $v$, then
$Df(v)=f'(v)Dv$ for all polynomials $f$.


(b) Let $s \in A$. The element $v \in A$ is s-Volterra if $D_s v = v^2$. (Example:
in $A = B(L^2([0,1]))$, take $S: f(t) \mapsto t f(t)$ and $V: f(t) \mapsto \int_0^t f(u)\,du$,
the so-called classical Volterra operator.) Prove: (i) $D_s v^n = n v^{n+1}$;
(ii) $v^{n+1} = D_s^n v / n!$; and (iii) $C(s + \alpha v, s + \beta v)^n e = (-1)^n n! \binom{\beta - \alpha}{n} v^n$,
for all $n \in \mathbb{N}$ and $\alpha, \beta \in \mathbb{C}$.

(c) If $v \in A$ is s-Volterra, then (i) $\|v^n\|^{1/n} = O(1/n)$. In particular,
v is quasi-nilpotent. (ii) $r(s + \alpha v, s + \beta v) = 0$ if $\beta - \alpha \in \mathbb{N} \cup \{0\}$,
and $= \limsup (n! \|v^n\|)^{1/n}$ otherwise. (iii) $d(s + \alpha v, s + \beta v) =
\limsup (n! \|v^n\|)^{1/n}$ if $\alpha \ne \beta$. (iv) For $\alpha \ne \beta$, $s + \alpha v$ and $s + \beta v$ are
spectrally equivalent iff $\|v^n\|^{1/n} = o(1/n)$. (v) $d(s + \alpha v, s + \beta v) \le
\operatorname{diam} \sigma(s)$. (vi) $d(S + \alpha V, S + \beta V) = 1$ when $\alpha \ne \beta$ (cf. Part (b) for
notation). In particular, $S + \alpha V$ and $S + \beta V$ are spectrally equivalent
iff $\alpha = \beta$ (however, they all have the same spectrum, but do not try to
prove this here!). Note that if $\beta - \alpha \in \mathbb{N}$, then $r(S + \alpha V, S + \beta V) = 0$
while $r(S + \beta V, S + \alpha V) = 1$.


(d) If v is s-Volterra, then

$$R(\lambda; v) = \lambda^{-1} e + \lambda^{-2} \exp(s/\lambda)\, v \exp(-s/\lambda) \qquad (\lambda \ne 0).$$

(e) If v is s-Volterra, then for all $\alpha, \lambda \in \mathbb{C}$

$$\exp[\lambda(s + \alpha v)] = \exp(\lambda s)(e + \lambda v)^{\alpha} = (e - \lambda v)^{-\alpha} \exp(\lambda s),$$

where the binomials are defined by means of the usual series (note
that v is quasi-nilpotent, so that the binomial series converge for all
complex $\lambda$).


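(f) If v is s-Volterra and $\rho(s)$ is connected, then $\sigma(s + kv) \subset \sigma(s)$ for all
$k \in \mathbb{Z}$. For all $\lambda \in \rho(s)$,

$$R(\lambda; s + kv) = \sum_{j=0}^{k} \binom{k}{j} j!\, R(\lambda; s)^{j+1} v^j \qquad (k \ge 0);$$
$$R(\lambda; s + kv) = \sum_{j=0}^{|k|} (-1)^j \binom{|k|}{j} j!\, v^j R(\lambda; s)^{j+1} \qquad (k < 0).$$

(Apply Exercise 9.) If $\rho(s)$ and $\rho(s + kv)$ are both connected for some
integer k, then $\sigma(s + kv) = \sigma(s)$. In particular, if $\sigma(s) \subset \mathbb{R}$, then
$\sigma(s + kv) = \sigma(s)$ for all $k \in \mathbb{Z}$.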
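
The scalar identity behind Part (b)(iii) can be checked numerically. The relation $D_s v = v^2$ gives $C(s + \alpha v, s + \beta v) v^k = (k + \alpha - \beta) v^{k+1}$, so the coefficient of $v^n$ in $C(s + \alpha v, s + \beta v)^n e$ is the product of $k + \alpha - \beta$ over $k = 0, \ldots, n-1$. The sketch below compares this product with $(-1)^n n! \binom{\beta - \alpha}{n}$ in exact rational arithmetic. It is a sanity check for sample rational values only, not a proof, and the helper names are illustrative.

```python
from fractions import Fraction
from math import factorial

def gen_binom(z, n):
    """Generalized binomial coefficient z(z-1)...(z-n+1)/n!."""
    out = Fraction(1)
    for k in range(n):
        out = out * (z - k) / (k + 1)
    return out

def iterated_coefficient(alpha, beta, n):
    """Product of (k + alpha - beta) for k = 0..n-1."""
    out = Fraction(1)
    for k in range(n):
        out *= (k + alpha - beta)
    return out

def closed_form(alpha, beta, n):
    """(-1)^n n! binom(beta - alpha, n), as in Part (b)(iii)."""
    return (-1) ** n * factorial(n) * gen_binom(beta - alpha, n)

alpha, beta = Fraction(1, 3), Fraction(5, 2)
for n in range(7):
    print(n, iterated_coefficient(alpha, beta, n), closed_form(alpha, beta, n))
```

When $\beta - \alpha$ is a non-negative integer, both expressions vanish for $n > \beta - \alpha$, which is consistent with the finite sums in Part (f).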


13. Let A be a unital Banach algebra, and let $a, b, c \in A$ be such that
$C(a,b)c = 0$ (i.e., $ac = cb$). Prove:


(a) $C(e^a, e^b)c = 0$ (i.e., $e^a c = c e^b$, where the exponential function $e^x$
is defined by the usual absolutely convergent series; the base of the
exponential should not be confused with the identity of A!).


(b) If A is a C*-algebra, then $e^{x - x^*}$ is unitary, for any $x \in A$. (In
particular, $\|e^{x - x^*}\| = 1$.)


(c) If A is a C*-algebra and a, b are normal elements (such that $ac = cb$,
as before!), then

$$\|e^{a^*} c\, e^{-b^*}\| \le \|c\|.$$

(d) For a, b, c as in Part (c), define

$$f(\lambda) = e^{\lambda a^*} c\, e^{-\lambda b^*} \qquad (\lambda \in \mathbb{C}).$$

Prove that $\|f(\lambda)\| \le \|c\|$ for all $\lambda \in \mathbb{C}$, and conclude that $f(\lambda) = c$ for
all $\lambda$ (i.e., $e^{\lambda a^*} c = c\, e^{\lambda b^*}$ for all $\lambda \in \mathbb{C}$).


(e) If A is a C*-algebra, and a, b are normal elements of A such that
$ac = cb$ for some $c \in A$, then $a^* c = c b^*$. (Consider the coefficient of $\lambda$
in the last identity in Part (d).)


In particular, if c commutes with a normal element a, it commutes also 
with its adjoint; this is Fuglede’s theorem. 


14. Consider $L^1(\mathbb{R})$ (with respect to Lebesgue measure) with convolution
as multiplication. Prove that $L^1(\mathbb{R})$ is a non-unital commutative
Banach algebra, and the Fourier transform F is a contractive
(i.e., norm-decreasing) homomorphism of $L^1(\mathbb{R})$ into $C_0(\mathbb{R})$ (cf. Exercise 7,
Chapter 2).


15. Let $\phi$ be a non-zero homomorphism of the Banach algebra $L^1 = L^1(\mathbb{R})$
into $\mathbb{C}$, that is, a character of $L^1$ (cf. Exercise 14). Prove:


(a) There exists a unique $h \in L^\infty = L^\infty(\mathbb{R})$ such that $\phi(f) = \int f h\,dx$ for
all $f \in L^1$, and $\|h\|_\infty = 1$. Moreover,

$$\phi(f_y)\phi(g) = \phi(f)\phi(g_y) \qquad (f, g \in L^1;\ y \in \mathbb{R}),$$

where $f_y(x) = f(x - y)$.
(b) For any $f \in L^1$ such that $\phi(f) \ne 0$,
(i) $h(y) = \phi(f_y)/\phi(f)$ a.e. (in particular, h may be chosen to be
continuous).
(ii) $\phi(f_y) \ne 0$ for all $y \in \mathbb{R}$.
(iii) $|h(y)| = 1$ for all $y \in \mathbb{R}$.
(iv) $h(x + y) = h(x)h(y)$ for all $x, y \in \mathbb{R}$ and $h(0) = 1$.


Conclude that $h(y) = e^{-ity}$ for some $t \in \mathbb{R}$ (for all y) and that
$\phi(f) = (Ff)(t)$, where F is the Fourier transform.


Conversely, each $t \in \mathbb{R}$ determines the homomorphism $\phi_t(f)
= (Ff)(t)$. Conclude that the map $t \mapsto \phi_t$ is a homeomorphism of $\mathbb{R}$
onto the Gelfand space $\Phi$ of $L^1$ (recall that it is the space of characters of
$L^1$ with the Gelfand topology). (Hint: the Gelfand topology is Hausdorff
and is weaker than the metric topology on $\mathbb{R}$.)


16. Let A, B be commutative Banach algebras, B semi-simple. Let $\tau: A \to B$
be an algebra homomorphism. Prove that $\tau$ is continuous. (Hint: for each
$\phi \in \Phi(B)$ (the Gelfand space of B), $\phi \circ \tau \in \Phi(A)$. Use the closed graph
theorem.)


17. (a) Let X be a compact Hausdorff space, and let A = C(X). Prove that
the Gelfand space $\Phi$ of A is homeomorphic to X. (Hint: consider the
map $t \in X \mapsto \phi_t \in \Phi$, where $\phi_t(f) = f(t)$ ($f \in C(X)$) (this is the
"evaluation at t" homomorphism). If there exists $\phi \in \Phi$ such that $\phi \ne \phi_t$ for all
$t \in X$ and $M = \ker \phi$, then for each $t \in X$ there exists $f_t \in M$ such
that $f_t(t) \ne 0$. Use continuity of the functions and compactness of X
to get a finite set $\{f_{t_j}\} \subset M$ such that $h := \sum_j |f_{t_j}|^2 > 0$ on X, hence
$h \in G(A)$; however, $h \in M$, contradiction. Thus, $t \mapsto \phi_t$ is onto $\Phi$,
and one-to-one (by Urysohn's lemma). Identifying $\Phi$ with X through
this map, observe that the Gelfand topology is weaker than the given
topology on X and is Hausdorff.)


(b) Extend part (a) to locally compact Hausdorff spaces X and $A :=
C_0(X)$, proving again that the map $t \in X \mapsto \phi_t \in \Phi$ is a
homeomorphism. (Hint: assuming by contradiction that $\phi \in \Phi$ is not
an evaluation character and writing $M := \ker \phi$, for every compact
$K \subset X$ there is an $h \in M$ with $h \ge 1$ on K. Use this and Urysohn's
lemma to deduce that $C_c(X)$ is contained in M, so density of $C_c(X)$ in
$C_0(X)$ and closedness of M imply that $M = C_0(X)$, a contradiction.
Finally, Urysohn's lemma can be useful again when comparing the
two topologies on X.)


18. Let U be the open unit disc in $\mathbb{C}$. For $n \in \mathbb{N}$, let $A = A(U^n)$ denote the
Banach algebra of all complex functions analytic in $U^n := U \times \cdots \times U$
and continuous on the closure $\overline{U}^n$ of $U^n$ in $\mathbb{C}^n$, with pointwise operations
and supremum norm $\|f\|_u := \sup\{|f(z)|;\ z \in \overline{U}^n\}$. Let $\Phi$ be the Gelfand
space of A. Given $f \in A$ and $0 < r < 1$, denote $f_r(z) = f(rz)$ and
$Z(f) = \{z \in \overline{U}^n;\ f(z) = 0\}$. Prove:


(a) $f_r$ is the sum of an absolutely and uniformly convergent power series
in $\overline{U}^n$. Conclude that the polynomials (in n variables) are dense in A.
(b) Each $\phi \in \Phi$ is an "evaluation homomorphism" $\phi_w$ for some $w \in \overline{U}^n$,
where $\phi_w(f) = f(w)$. (Hint: consider the polynomials $p_j(z) = z_j$
(where $z = (z_1, \ldots, z_n)$). Then $w := (\phi(p_1), \ldots, \phi(p_n)) \in \overline{U}^n$ and
$\phi(p_j) = p_j(w)$. Hence $\phi(p) = p(w)$ for all polynomials p. Apply
Part (a) to conclude that $\phi = \phi_w$. The map $w \mapsto \phi_w$ is the wanted
homeomorphism of $\overline{U}^n$ onto $\Phi$.)

(c) Given $f_1, \ldots, f_m \in A$ such that $\bigcap_{k=1}^m Z(f_k) = \emptyset$, there exist
$g_1, \ldots, g_m \in A$ such that $\sum_k f_k g_k = 1$ (on $\overline{U}^n$). (Hint: otherwise, the
ideal J generated by $f_1, \ldots, f_m$ is proper, and therefore there exists
$\phi \in \Phi$ vanishing on J. Apply Part (b) to reach a contradiction.)


19. Let A be a unital Banach algebra such that

$$K := \sup_{0 \ne a \in A} \frac{\|a\|^2}{\|a^2\|} < \infty.$$

Prove:


(a) $\|a\| \le K r(a)$ for all $a \in A$.


(b) $\|p(a)\| \le K \|p\|_{C(\sigma(a))}$ for all $p \in P$, where P denotes the algebra of
all polynomials of one complex variable over $\mathbb{C}$.
(c) If $a \in A$ has the property that P is dense in $C(\sigma(a))$, then there exists
a continuous algebra homomorphism (with norm $\le K$) $\tau: C(\sigma(a)) \to
A$ such that $\tau(p) = p(a)$ for all $p \in P$.


20. Let $K \subset \mathbb{C}$ be compact, $K \ne \emptyset$, and let C(K) be the corresponding
Banach algebra of continuous functions with the supremum norm $\|f\|_K :=
\sup_K |f|$. Denote $P_1 := \{p \in P;\ \|p\|_K \le 1\}$ (cf. Exercise 19(b)).


Let X be a Banach space, and $T \in B(X)$. For $x \in X$, denote
$$\|x\|_T := \sup_{p \in P_1} \|p(T)x\|;$$
$$Z_T := \{x \in X;\ \|x\|_T < \infty\}.$$
Prove:


(a) $Z_T$ is a Banach space for the norm $\|\cdot\|_T$ (which is greater than the
given norm on X).


(b) $\|p(T)\|_{B(Z_T)} \le 1$ for all $p \in P_1$.

(c) If the compact set K is such that P is dense in C(K), there exists
a contractive algebra homomorphism $\tau: C(K) \to B(Z_T)$ such that
$\tau(p) = p(T)$ for all $p \in P$.


21. Let A be a non-unital C*-algebra. Prove that the unitization $A^\#$ defined in
Section 7.3 is a (unital) C*-algebra in which A is isometrically embedded.
(Hint: the proof should show that the norm on $A^\#$ is indeed a norm, that
it extends that of A, that it is submultiplicative and that it satisfies the
C*-identity, and that $A^\#$ is complete. For the last part, note that the
quotient $A^\#/A$ is one dimensional and use Exercise 22 of Chapter 6.)


22. Let X be a non-compact, locally compact Hausdorff space and let Y
be its Alexandroff one-point compactification. Find natural isometric *-isomorphisms
$C_0(X)^\# \cong \{f + \lambda 1;\ f \in C_0(X),\ \lambda \in \mathbb{C}\} \cong C(Y)$, where $C_0(X)^\#$
is the unitization of the C*-algebra $C_0(X)$ defined in Section 7.3 and the
middle algebra is viewed as a C*-subalgebra of $C_b(X)$.


23. Let A be a Banach algebra with an involution such that $\|a\|^2 \le \|a^* a\|$ for
all $a \in A$. Prove that A is a C*-algebra.


24. Let u be a unitary element in a unital C*-algebra. Without using the
operational calculus, explain why $\sigma(u), \sigma(u^{-1}) \subset \{z \in \mathbb{C};\ |z| \le 1\}$, and
use the relation $\sigma(u^{-1}) = \sigma(u)^{-1} := \{z^{-1};\ z \in \sigma(u)\}$ to conclude that
$\sigma(u) \subset \{z \in \mathbb{C};\ |z| = 1\}$. (We proved this in a less elementary fashion.)


25. Let A be a C*-algebra and $0 \ne p \in A$ be a selfadjoint idempotent: $p^2 =
p = p^*$. Prove that $pAp := \{pap;\ a \in A\}$ is a C*-subalgebra of A with p
as its unit (such a subalgebra is called a corner of A). Assuming that p
is not a unit of A, prove that $\sigma_A(x) = \sigma_{pAp}(x) \cup \{0\}$ for all $x \in pAp$ (this
part of the exercise is purely algebraic).


26. This exercise supplements Theorem 7.18. Let B be a C*-subalgebra of a
C*-algebra A. Prove that if B has a unit $e_B$ which is not a unit for A, then
$\sigma_A(x) = \sigma_B(x) \cup \{0\}$ for all $x \in B$; while in any other case, $\sigma_B(x) = \sigma_A(x)$
for all $x \in B$. Hint: if A and B have the same unit, the hypotheses of
Theorem 7.18 hold. If B is not unital, note that $B^\#$ is algebraically isomorphic
to the C*-algebra $\operatorname{span}(B \cup \{e_{A^\#}\})$. Finally, assume that B has a unit
$e_B$ which is not a unit for A, and consider the unital C*-algebra $e_B A e_B$,
which contains B.


27. Let A be a C*-algebra and $x \in A$ be normal. Prove:


(a) If $\sigma(x) \subset \mathbb{R}$ then x is selfadjoint.
(b) If A is unital and $\sigma(x) \subset \{z \in \mathbb{C};\ |z| = 1\}$, then x is unitary.


28. Let A be a unital C*-algebra and $u \in A$ be unitary. Suppose that $\sigma(u) \ne
\{z \in \mathbb{C};\ |z| = 1\}$. Prove that there is a selfadjoint $a \in A$ such that $u = e^{ia}$.
Compare Part (a) of Exercise 5 in Chapter 9.


29. Let A be a unital C*-algebra. Suppose that $x \in A$ is normal and $\mu$ is an
isolated point of $\sigma(x)$. Define $e_\mu \in C(\sigma(x))$ to be 1 at $\mu$ and 0 elsewhere.
Prove that $e_\mu(x)$ is a non-zero selfadjoint idempotent and that $x e_\mu(x) =
\mu e_\mu(x)$. Compare Theorem 9.8 and Section 9.11.


30. Let A be a non-unital C*-algebra and $x \in A$ be normal.


(a) Prove that the (generally non-unital) C*-subalgebra B of A generated
by x is equal to $\{f(x);\ f \in C(\sigma(x)),\ f(0) = 0\}$. (Hint: we already
explained one inclusion.)


(b) Deduce that the map $f \mapsto f(x)$ is an isometric *-isomorphism from
the C*-subalgebra $\{f \in C(\sigma(x));\ f(0) = 0\}$ of $C(\sigma(x))$ onto B.


31. Let A be a Banach algebra.
(a) Prove that the set of all weakly almost periodic elements in $A^*$ is a
closed linear subspace of $A^*$.
(b) Prove that for $\omega \in A^*$, the following conditions are equivalent:
(i) $\omega$ is weakly almost periodic;


(ii) whenever $\{a_i\}$ and $\{b_j\}$ are bounded nets in A, the equality
$$\lim_i \bigl(\lim_j \omega(a_i b_j)\bigr) = \lim_j \bigl(\lim_i \omega(a_i b_j)\bigr)$$
holds provided that the limits in both sides exist; and


(iii) same as (ii), but with sequences in lieu of nets.
(Hint for (iii) $\Rightarrow$ (i): assume that (i) is false and construct by
recursion two sequences that make (iii) false.)


32. In this exercise we identify $c_0^*$ with $\ell^1$ as in Exercise 8 of Chapter 4 and $(\ell^1)^*$
with $\ell^\infty$ as in Theorem 4.6, and thus also $c_0^{**}$ with $\ell^\infty$ (all identifications
are by isometric isomorphisms).


(a) Prove that for every $a \in c_0$ the canonical image $\kappa(a) \in \ell^\infty$ equals a,
viewed as an element of $\ell^\infty$. That is, $\kappa$ is the inclusion map.

(b) Let $a \in c_0$, $\omega \in \ell^1$ and $X, Y \in \ell^\infty$. Using the notation of Section 7.5,
what are $a\omega, \omega a \in \ell^1$, $X\omega, \omega X \in \ell^1$ and $X \Box Y, X \Diamond Y \in \ell^\infty$? Show
that $c_0$ is Arens regular.


33. The Banach space $\ell^1(\mathbb{Z})$ becomes a Banach algebra with the product being
convolution *: the convolution of $a, b \in \ell^1(\mathbb{Z})$ is $a * b \in \ell^1(\mathbb{Z})$ given by
$(a * b)_k = \sum_{l \in \mathbb{Z}} a_{k-l} b_l$ ($k \in \mathbb{Z}$).


(a) Define sequences $\{p_n\}_{n=1}^\infty$ and $\{q_m\}_{m=1}^\infty$ in $\mathbb{Z}$ by $p_n := 2^{2n}$, $q_m :=
2^{2m+1}$ ($n, m \in \mathbb{N}$). Observe that

$$\{p_n + q_m;\ n < m\} \cap \{p_n + q_m;\ n > m\} = \emptyset. \qquad (1)$$

(b) Let $\{p_n\}_{n=1}^\infty$ and $\{q_m\}_{m=1}^\infty$ be sequences in $\mathbb{Z}$ satisfying (1). For $n, m \in
\mathbb{N}$, let $a_n, b_m \in \ell^1(\mathbb{Z})$ be the indicator functions of the singletons $\{p_n\}$
and $\{q_m\}$, respectively. Let $\omega \in \ell^\infty(\mathbb{Z})$ be the indicator function of
the set $\{p_n + q_m;\ n < m\}$. Identifying $\ell^\infty(\mathbb{Z})$ with $(\ell^1(\mathbb{Z}))^*$, prove that
$\omega$ does not satisfy Condition (iii) of Part (b) of Exercise 31, so that
it is not weakly almost periodic. This proves that $\ell^1(\mathbb{Z})$ is not Arens
regular.


34. Use the idea of Exercise 33 to show that the Banach algebra $L^1(\mathbb{R})$
(with convolution as product, see Exercise 14) is not Arens regular.
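
The combinatorial condition (1) of Exercise 33 and the two iterated limits it produces can be explored on a finite truncation. The sketch below uses the sequences of Part (a), and the fact that $a_n * b_m$ is the indicator of the singleton $\{p_n + q_m\}$, so that $\omega(a_n * b_m)$ is the value of $\omega$ at $p_n + q_m$. The cut-off N and the function names are illustrative assumptions. A finite check only suggests the behaviour of the limits and does not replace the proof.

```python
def p(n):
    return 2 ** (2 * n)        # p_n = 2^(2n)

def q(m):
    return 2 ** (2 * m + 1)    # q_m = 2^(2m+1)

N = 12  # truncation level (assumption made for the experiment)
indices = range(1, N + 1)
lower = {p(n) + q(m) for n in indices for m in indices if n < m}
upper = {p(n) + q(m) for n in indices for m in indices if n > m}

# Condition (1), restricted to indices up to N.
disjoint = lower.isdisjoint(upper)

def omega(k):
    """Indicator of {p_n + q_m; n < m}, truncated at N."""
    return 1 if k in lower else 0

# For fixed n and large m the value is 1; for fixed m and large n it is 0.
m_large = [omega(p(n) + q(N)) for n in range(1, N)]
n_large = [omega(p(N) + q(m)) for m in range(1, N)]
print(disjoint, m_large, n_large)
```

The two lists indicate that the iterated limits in Condition (iii) of Exercise 31 are 1 and 0, respectively.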


8 


Hilbert spaces 


Recall that a Banach space is called a Hilbert space when its norm is induced 
by an inner product. In contrast to general Banach spaces, whose geometry 
can be very complicated, Hilbert spaces have a clear and simple geometry. This 
is demonstrated, first and foremost, by the results proved in our first look at 
Hilbert spaces in Section 1.7, namely, the distance theorem, the orthogonal 
decomposition theorem, and the “Little” Riesz representation theorem. 

This chapter continues the study of Hilbert spaces, the first central 
notions being orthonormal sets and bases. We give several characterizations of 
orthonormal bases, and prove that they always exist and that all orthonormal 
bases of a specific Hilbert space X have the same cardinality, called the Hilbert 
dimension of X. Along the way we introduce projections and particularly 
orthogonal projections and see how they are related to orthonormal sets. 

We define and characterize Hilbert space isomorphisms and find a canonical 
model for Hilbert spaces: every Hilbert space is isomorphic to some L?-space. 

Two basic ways to construct Hilbert spaces are presented, namely direct sums 
and tensor products. The latter construction is essential in operator algebras and 
we rely on it in Chapter 13. 


8.1 Orthonormal sets 


We recall that the vectors x, y in an inner product space X are (mutually)
orthogonal if $(x, y) = 0$ (cf. Theorem 1.35). The set $A \subset X$ is orthogonal if any
two distinct vectors of A are orthogonal; it is orthonormal if it is orthogonal and
all vectors in A are unit vectors, that is, if

$$(a, b) = \delta_{a,b} \qquad (a, b \in A),$$

where $\delta_{a,b}$ is Kronecker's delta, which equals zero for $a \ne b$ and equals one for
$a = b$.

A classical example is the set $A = \{e^{int};\ n \in \mathbb{N}\}$ in the Hilbert space
$X = L^2([0, 2\pi])$ with the normalized Lebesgue measure $dt/2\pi$.


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press. 
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0008


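Lemma 8.1 (Pythagoras's theorem). Let $\{x_k;\ k = 1, \ldots, n\}$ be an orthogonal
subset of the inner product space X. Then

$$\Bigl\| \sum_{k=1}^{n} x_k \Bigr\|^2 = \sum_{k=1}^{n} \|x_k\|^2.$$

Proof. By "sesqui-linearity" of the inner product, the left-hand side equals

$$\sum_{k=1}^{n} \sum_{j=1}^{n} (x_k, x_j).$$

Since $(x_k, x_j) = 0$ for $k \ne j$, the last sum equals $\sum_k (x_k, x_k) = \sum_k \|x_k\|^2$.


If $A := \{a_1, \ldots, a_n\} \subset X$ is orthonormal and $\{\lambda_1, \ldots, \lambda_n\} \subset \mathbb{C}$, then (taking
$x_k = \lambda_k a_k$ in Lemma 8.1),

$$\Bigl\| \sum_{k=1}^{n} \lambda_k a_k \Bigr\|^2 = \sum_{k=1}^{n} |\lambda_k|^2. \qquad (1)$$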
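
Identity (1) is easy to test on a concrete example. The following sketch works in $\mathbb{C}^3$ with the standard inner product (linear in the first variable) and checks (1) for one orthonormal pair. The vectors and scalars are arbitrary illustrative choices.

```python
import math

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(xi * complex(yi).conjugate() for xi, yi in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

s = 1 / math.sqrt(2)
a1 = [s, s * 1j, 0]         # an orthonormal pair in C^3
a2 = [s, -s * 1j, 0]
lams = [2 - 1j, 0.5 + 3j]   # arbitrary complex scalars

x = [lams[0] * u + lams[1] * v for u, v in zip(a1, a2)]
lhs = norm_sq(x)                       # squared norm of the combination
rhs = sum(abs(l) ** 2 for l in lams)   # sum of |lambda_k|^2
print(lhs, rhs)
```

Both numbers agree up to rounding, as (1) predicts.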


Theorem 8.2. Let $\{a_k;\ k = 1, 2, \ldots\}$ be an orthonormal sequence in the Hilbert
space X, and let $\lambda := \{\lambda_k;\ k = 1, 2, \ldots\}$ be a complex sequence. Then

(a) The series $\sum_k \lambda_k a_k$ converges in X iff $\|\lambda\|_2^2 := \sum_k |\lambda_k|^2 < \infty$.

(b) In this case, the shown series converges unconditionally in X, and

$$\Bigl\| \sum_k \lambda_k a_k \Bigr\| = \|\lambda\|_2.$$

Proof. By (1) applied to the orthonormal set $\{a_{m+1}, \ldots, a_n\}$ and the set of
scalars $\{\lambda_{m+1}, \ldots, \lambda_n\}$ with $n > m \ge 0$,

$$\Bigl\| \sum_{k=m+1}^{n} \lambda_k a_k \Bigr\|^2 = \sum_{k=m+1}^{n} |\lambda_k|^2. \qquad (2)$$

This means that the series $\sum_k \lambda_k a_k$ satisfies Cauchy's condition iff the series
$\sum_k |\lambda_k|^2$ satisfies Cauchy's condition. Since X is complete, this is equivalent to
Statement (a) of the theorem.
Suppose now that $\|\lambda\|_2 < \infty$, and let $s \in X$ denote the sum of the
series $\sum_k \lambda_k a_k$. Taking m = 0 and letting $n \to \infty$ in (2), we obtain $\|s\| = \|\lambda\|_2$.
Since $(\cdot, x)$ is a continuous linear functional (for any fixed $x \in X$), we have

$$(s, x) = \sum_{k=1}^{\infty} \lambda_k (a_k, x). \qquad (3)$$

If $\pi: \mathbb{N} \to \mathbb{N}$ is any permutation of $\mathbb{N}$, the series $\sum_k |\lambda_{\pi(k)}|^2$ converges to $\|\lambda\|_2^2$ by
a well-known property of positive series. Therefore, by what we already proved,
the series $\sum_k \lambda_{\pi(k)} a_{\pi(k)}$ converges in X; denoting its sum by t, we also have
$\|t\| = \|\lambda\|_2$, and by (3), for any $x \in X$,

$$(t, x) = \sum_k \lambda_{\pi(k)} (a_{\pi(k)}, x).$$

Choose $x = a_j$ for $j \in \mathbb{N}$ fixed. By orthonormality, we get $(t, a_j) = \lambda_j$, and
therefore, by (3) with x = t,

$$(t, s) = \overline{(s, t)} = \overline{\sum_k \lambda_k (a_k, t)} = \sum_k \overline{\lambda_k}\,(t, a_k) = \sum_k |\lambda_k|^2 = \|\lambda\|_2^2.$$

Hence $\|t - s\|^2 = \|t\|^2 - 2\Re(t, s) + \|s\|^2 = 0$, and t = s.


Lemma 8.3 (Bessel's inequality). Let $\{a_1, \ldots, a_n\}$ be an orthonormal set in
the inner product space X. Then for all $x \in X$,

$$\sum_k |(x, a_k)|^2 \le \|x\|^2.$$

Proof. Given $x \in X$, denote $y := \sum_k (x, a_k) a_k$. Then by (1)

$$(y, x - y) = (y, x) - (y, y) = \sum_k (x, a_k)(a_k, x) - \|y\|^2 = \sum_k |(x, a_k)|^2 - \|y\|^2 = 0.$$

Therefore, by Lemma 8.1,

$$\|x\|^2 = \|(x - y) + y\|^2 = \|x - y\|^2 + \|y\|^2 \ge \|y\|^2 = \sum_k |(x, a_k)|^2.$$
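
The proof of Bessel's inequality can be followed numerically. The sketch below, again in $\mathbb{C}^3$ with illustrative data, computes the coefficients $(x, a_k)$, forms the vector y of the proof, and checks both that y is orthogonal to x - y and that the sum of the squared coefficients does not exceed $\|x\|^2$.

```python
import math

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(xi * complex(yi).conjugate() for xi, yi in zip(x, y))

s = 1 / math.sqrt(2)
a = [[s, s * 1j, 0], [s, -s * 1j, 0]]   # orthonormal set {a_1, a_2}
x = [1, 2j, 3]                          # arbitrary vector

coeffs = [inner(x, ak) for ak in a]               # the numbers (x, a_k)
bessel_sum = sum(abs(c) ** 2 for c in coeffs)     # sum of |(x, a_k)|^2
y = [sum(c * ak[i] for c, ak in zip(coeffs, a)) for i in range(3)]
residual = [xi - yi for xi, yi in zip(x, y)]      # x - y

print(bessel_sum, inner(x, x).real, abs(inner(y, residual)))
```

Here the inequality is strict because x has a component outside the span of $a_1, a_2$.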


Corollary 8.4. Let $\{a_k;\ k = 1, 2, \ldots\}$ be an orthonormal sequence in the Hilbert
space X. Then for any $x \in X$, the series $\sum_k (x, a_k) a_k$ converges unconditionally
in X to an element Px, and $P \in B(X)$ has the following properties:


(a) $\|P\| = 1$;
(b) $P^2 = P$;
(c) the ranges PX and (I - P)X are orthogonal;
(d) $P^* = P$.


Proof. Let $x \in X$. By Bessel's inequality (Lemma 8.3), the partial sums of the
positive series $\sum_k |(x, a_k)|^2$ are bounded by $\|x\|^2$; the series therefore converges,
and consequently $\sum_k (x, a_k) a_k$ converges unconditionally to an element $Px \in X$,
by Theorem 8.2. The linearity of P is trivial, and by Theorem 8.2, $\|Px\|^2 =
\sum_k |(x, a_k)|^2 \le \|x\|^2$, so that $P \in B(X)$ and $\|P\| \le 1$. By (3)

$$(Px, a_j) = \sum_{k=1}^{\infty} (x, a_k)(a_k, a_j) = (x, a_j). \qquad (4)$$

Therefore
$$P^2 x = P(Px) = \sum_j (Px, a_j) a_j = \sum_j (x, a_j) a_j = Px.$$

This proves Property (b).

By Property (b), $\|P\| = \|P^2\| \le \|P\|^2$. Since $P \ne 0$ (e.g., $P a_j = a_j \ne 0$ for
all $j \in \mathbb{N}$), it follows that $\|P\| \ge 1$, and therefore, $\|P\| = 1$, by our previous
inequality. This proves Property (a). By (3) with the proper choices of scalars
and vectors, we have for all $x, y \in X$

$$(Py, x) = \sum_k (y, a_k)(a_k, x) = \overline{\sum_k (x, a_k)(a_k, y)} = \overline{(Px, y)} = (y, Px).$$

This proves Property (d), and therefore, by Property (b),

$$(Px, (I - P)y) = (x, P(I - P)y) = (x, (P - P^2)y) = 0,$$

which verifies Property (c).
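
In finite dimension the operator P of Corollary 8.4 is a matrix, and Properties (b), (c), and (d) can be checked directly. The sketch below builds P from two orthonormal vectors in $\mathbb{C}^3$ through $P_{ij} = \sum_k a_k[i]\,\overline{a_k[j]}$, which is the matrix of $x \mapsto \sum_k (x, a_k) a_k$. The vectors and helper names are illustrative assumptions.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j])
               for i in range(len(A)) for j in range(len(A[0])))

n = 3
a1 = [1 / math.sqrt(3)] * 3
a2 = [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]
vectors = [a1, a2]   # orthonormal set

# Matrix of x -> sum_k (x, a_k) a_k
P = [[sum(a[i] * complex(a[j]).conjugate() for a in vectors)
      for j in range(n)] for i in range(n)]
I = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
Q = [[I[i][j] - P[i][j] for j in range(n)] for i in range(n)]   # I - P
Z = [[0.0] * n for _ in range(n)]

print(max_diff(matmul(P, P), P))   # Property (b): P^2 = P
print(max_diff(adjoint(P), P))     # Property (d): P* = P
print(max_diff(matmul(P, Q), Z))   # P(I - P) = 0, which gives Property (c)
```

All three printed deviations are zero up to rounding.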


8.2 Projections 


This section discusses projections and, more specifically, orthogonal projections. 
The existence of general projections in Banach spaces was studied in Exercise 18 
of Chapter 6. 


Terminology 8.5.


(a) Any $P \in B(X)$ satisfying Property (b) in Corollary 8.4 is called a
projection (X could be any Banach space!). If P is a projection, so is
I - P (because $(I - P)^2 = I - 2P + P^2 = I - P$); I - P is called the
complementary projection of P. Note that the complementary projections
P and I - P commute and have product equal to zero.


(b) For any projection P on a Banach space X,
$$PX = \{x \in X;\ Px = x\} = \ker(I - P) \qquad (1)$$
(if x = Py for some $y \in X$, then $Px = P^2 y = Py = x$). Since I - P is
continuous, it follows from (1) that the range PX is closed. The closed
subspaces PX and (I - P)X have trivial intersection (if x is in the
intersection, then x = Px, and x = (I - P)x = x - Px = 0), and
their sum is X (every x can be written as Px + (I - P)x). This means
that X is the direct sum of the closed subspaces PX and (I - P)X.

A closed subspace $M \subset X$ is T-invariant (for a given $T \in B(X)$) if
$TM \subset M$. By (1), the closed subspace PX is T-invariant iff P(TPx) =
TPx for all $x \in X$, that is, iff TP = PTP. Applying this to the
complementary projection I - P, we conclude that (I - P)X is T-invariant
iff T(I - P) = (I - P)T(I - P), that is (expand and cancel!), iff PT = PTP.
Therefore both complementary subspaces PX and (I - P)X are T-invariant
iff P commutes with T. One says in this case that PX is a reducing
subspace for T (or that P reduces T).


(c) When X is a Hilbert space and a projection P in X satisfies Property (c) in
Corollary 8.4, it is called an orthogonal projection. In that case the direct
sum decomposition $X = PX \oplus (I - P)X$ is an orthogonal decomposition.
Conversely, if Y is any closed subspace of X, we may use the orthogonal
decomposition $X = Y \oplus Y^\perp$ (Theorem 1.36) to define an orthogonal
projection P in X with range equal to Y: given any $x \in X$, it has the
unique orthogonal decomposition x = y + z with $y \in Y$ and $z \in Y^\perp$; define
Px = y. It is easy to verify that P is the wanted (selfadjoint!) projection;
it is called the orthogonal projection onto Y. Given $T \in B(X)$, Y is a
reducing subspace for T iff P commutes with T (by Point (b)); since P
is selfadjoint, the relations PT = TP and $T^*P = PT^*$ are equivalent
(since one follows from the other by taking adjoints). Thus Y reduces T
iff it reduces $T^*$. If Y is invariant for both T and $T^*$, then (cf. Point (b))
TP = PTP and $T^*P = PT^*P$; taking adjoints in the second relation, we
get PT = PTP, hence TP = PT. As observed before, this last relation
implies in particular that Y is invariant for both T and $T^*$. Thus, if Y
is a closed subspace of the Hilbert space X and P is the corresponding
orthogonal projection, then for any $T \in B(X)$, the following propositions
are equivalent:


(i) Y reduces T;
(ii) Y is invariant for both T and $T^*$;
(iii) P commutes with T.


(d) In the proof of Corollary 8.4, we deduced Property (c) from Property (d).
Conversely, if P is an orthogonal projection, then it is selfadjoint. Indeed,
for any $x, y \in X$, (Px, (I - P)y) = 0, that is, (Px, y) = (Px, Py).
Interchanging the roles of x and y and taking complex conjugates, we get
(x, Py) = (Px, Py), and therefore (Px, y) = (x, Py). Thus, a projection
in Hilbert space is orthogonal iff it is selfadjoint.
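
A projection need not be orthogonal. The following sketch exhibits an oblique projection on $\mathbb{R}^2$: it is idempotent but not selfadjoint, its range and the range of its complement are not orthogonal, and it increases the norm of some vector. The matrix is an illustrative choice and is not taken from the text.

```python
import math

P = [[1.0, 1.0],
     [0.0, 0.0]]   # projects onto the x-axis along the direction (-1, 1)

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

P2 = [[sum(P[i][k] * P[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
is_idempotent = (P2 == P)
is_selfadjoint = all(P[i][j] == P[j][i] for i in range(2) for j in range(2))

u = apply(P, [1.0, 0.0])                                     # a vector in PX
w = [xi - pi for xi, pi in zip([0.0, 1.0], apply(P, [0.0, 1.0]))]  # in (I - P)X
dot = u[0] * w[0] + u[1] * w[1]

x = [1.0, 1.0]
ratio = math.hypot(*apply(P, x)) / math.hypot(*x)   # |Px| / |x|
print(is_idempotent, is_selfadjoint, dot, ratio)
```

The ratio exceeds 1, so this projection has norm greater than 1, in contrast with Property (a) of Corollary 8.4 for orthogonal projections.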


We consider now an arbitrary orthonormal set A in the Hilbert space X. 


Lemma 8.6. Let X be an inner product space and $A \subset X$ be orthonormal. Let
$\delta > 0$ and $x \in X$ be given. Then the set

$$A_\delta(x) := \{a \in A;\ |(x, a)| > \delta\}$$
is finite (it has at most $\lfloor \|x\|^2/\delta^2 \rfloor$ elements).


Proof. We may assume that $A_\delta(x) \ne \emptyset$ (otherwise there is nothing to prove).
Let then $a_1, \ldots, a_n$ be $n \ge 1$ distinct elements in $A_\delta(x)$. By Bessel's inequality,

$$n\delta^2 < \sum_{k=1}^{n} |(x, a_k)|^2 \le \|x\|^2,$$

so that $n < \|x\|^2/\delta^2$, and the conclusion follows.


Theorem 8.7. Let X be an inner product space, and $A \subset X$ be orthonormal.
Then for any given $x \in X$, the set

$$A(x) := \{a \in A;\ (x, a) \ne 0\}$$
is at most countable.


Proof. Since
$$A(x) = \bigcup_{m=1}^{\infty} A_{1/m}(x),$$

this is an immediate consequence of Lemma 8.6.


Notation 8.8. Let A be any orthonormal subset in the Hilbert space X. Given
$x \in X$, write A(x) as a (finite or infinite) sequence $A(x) = \{a_k\}$. Let Px be
defined as before with respect to the orthonormal sequence $\{a_k\}$. Since the
convergence is unconditional, the definition is independent of the particular
representation of A(x) as a sequence, and one may use the following notation
that ignores the sequential representation:

$$Px = \sum_{a \in A(x)} (x, a)a.$$

Since (x, a) = 0 for all $a \in A$ not in A(x), one may add to the earlier sum the
zero terms (x, a)a for all such vectors a, that is,

$$Px = \sum_{a \in A} (x, a)a.$$
By Corollary 8.4, P is an orthogonal projection.


Lemma 8.9. The ranges of the orthogonal projections P and I - P are $\overline{\operatorname{span}}(A)$
(the closed span of A, i.e., the closure of the linear span of A) and $A^\perp$,
respectively.


Proof. Since Pb = b for any $b \in A$, we have $A \subset PX$, and since PX is
a closed subspace, it follows that $\overline{\operatorname{span}}(A) \subset PX$. On the other hand, given
$x \in X$, represent $A(x) = \{a_k\}$; then $Px = \sum_k (x, a_k) a_k = \lim_n \sum_{k=1}^{n} (x, a_k) a_k \in
\overline{\operatorname{span}}(A)$, and the first statement of the lemma follows.

By uniqueness of the orthogonal decomposition, we have

$$(I - P)X = \bigl(\overline{\operatorname{span}}(A)\bigr)^\perp. \qquad (2)$$

Clearly, $x \in A^\perp$ iff $A \subset \ker(\cdot, x)$. Since $(\cdot, x)$ is a continuous linear functional,
its kernel is a closed subspace, and therefore, the last inclusion is equivalent to
$\overline{\operatorname{span}}(A) \subset \ker(\cdot, x)$, that is, to $x \in (\overline{\operatorname{span}}(A))^\perp$. This shows that the set on the
right of (2) is equal to $A^\perp$.


8.3 Orthonormal bases


Theorem 8.10. Let A be an orthonormal set in the Hilbert space X, and let P
be the associated projection. Then the following statements are equivalent:


(1) $A^\perp = \{0\}$.

(2) If $A \subset B \subset X$ and B is orthonormal, then A = B.

(3) $X = \overline{\operatorname{span}}(A)$.

(4) P = I.

(5) Every $x \in X$ has the representation

$$x = \sum_{a \in A} (x, a)a.$$

(6) For every $x, y \in X$, one has

$$(x, y) = \sum_{a \in A} (x, a)\overline{(y, a)},$$

where the series, which has at most countably many non-zero terms,
converges absolutely.


(7) For every $x \in X$, one has

$$\|x\|^2 = \sum_{a \in A} |(x, a)|^2.$$

Proof.


(1) implies (2). Suppose $A \subset B$ with B orthonormal. If $B \ne A$, pick $b \in B$,
$b \notin A$. For any $a \in A$, the vectors a, b are distinct (unit) vectors in the
orthonormal set B, and therefore (b, a) = 0. Hence, $b \in A^\perp$, and therefore
b = 0 by (1), contradicting the fact that b is a unit vector. Hence B = A.


(2) implies (3). If (3) does not hold, then by the orthogonal decomposition
theorem and Lemma 8.9, the subspace $A^\perp$ is non-trivial, and contains
therefore a unit vector b. Then $b \notin A$, so that the orthonormal set
$B := A \cup \{b\}$ contains A properly, contradicting (2).

(3) implies (4). By (3) and Lemma 8.9, the range of P is X, hence, Px = x
for all $x \in X$ (cf. Point (b) in Terminology 8.5), that is, P = I.

(4) implies (5). By (4), for all $x \in X$, $x = Ix = Px = \sum_{a \in A} (x, a)a$.


(5) implies (6). Given $x, y \in X$ and representing $A(x) = \{a_k\}$, we have by the
Cauchy-Schwarz inequality in the Hilbert space $\mathbb{C}^n$ (for any $n \in \mathbb{N}$):

$$\sum_{k=1}^{n} |(x, a_k)(y, a_k)| \le \Bigl(\sum_{k=1}^{n} |(x, a_k)|^2\Bigr)^{1/2} \Bigl(\sum_{k=1}^{n} |(y, a_k)|^2\Bigr)^{1/2} \le \|x\|\,\|y\|,$$

where we used Bessel's inequality for the last step.

Since the partial sums of the positive series $\sum_k |(x, a_k)(y, a_k)|$
are bounded, the series $\sum_k (x, a_k)\overline{(y, a_k)}$ converges absolutely, hence
unconditionally, and may be written without specifying the particular
ordering of A(x); after adding zero terms, we finally write it in the form
$\sum_{a \in A} (x, a)\overline{(y, a)}$. Now since $(\cdot, y)$ is a continuous linear functional, this
sum is equal to

$$\sum_k (x, a_k)(a_k, y) = \Bigl(\sum_k (x, a_k) a_k,\ y\Bigr) = (x, y),$$

where we used (5) for the last step.
(6) implies (7). Take x = y in (6).
(7) implies (1). If $x \in A^\perp$, (x, a) = 0 for all $a \in A$, and therefore $\|x\| = 0$ by (7).


Terminology 8.11. Let A be an orthonormal set in X. If A has Property (1), it 
is called a complete orthonormal set. If it has Property (2), it is called a maximal 
orthonormal set. If it has Property (3), one says that A spans X. We express 
Property (5) by saying that every x € X has a generalized Fourier expansion 
with respect to A. Properties (6) and (7) are the Parseval and Bessel identities 
for A, respectively. 

If A has any (and therefore all) of these seven properties, it is called an 
orthonormal basis or a Hilbert basis for X. 
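
The Parseval and Bessel identities can be observed for a concrete orthonormal basis. The sketch below uses the discrete Fourier vectors in $\mathbb{C}^3$, an illustrative basis chosen here as a finite analogue of the trigonometric system, and compares both sides of Properties (6) and (7) for two arbitrary vectors.

```python
import cmath
import math

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(xi * complex(yi).conjugate() for xi, yi in zip(x, y))

n = 3
# Discrete Fourier vectors: an orthonormal basis of C^n.
basis = [[cmath.exp(2j * math.pi * j * k / n) / math.sqrt(n) for j in range(n)]
         for k in range(n)]

x = [1 + 2j, -1, 0.5j]
y = [3, 1j, 2 - 1j]

parseval_lhs = inner(x, y)
parseval_rhs = sum(inner(x, a) * inner(y, a).conjugate() for a in basis)
bessel_lhs = inner(x, x).real
bessel_rhs = sum(abs(inner(x, a)) ** 2 for a in basis)
print(parseval_lhs, parseval_rhs, bessel_lhs, bessel_rhs)
```

If one basis vector is dropped, the set is no longer complete and the last sum can fall below $\|x\|^2$, which is Bessel's inequality.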


Example. (1) Let $\Gamma$ denote the unit circle in $\mathbb{C}$, and let A be the subalgebra of
$C(\Gamma)$ consisting of the restrictions to $\Gamma$ of all the functions in $\operatorname{span}\{z^k;\ k \in \mathbb{Z}\}$.
Since $\bar{z} = z^{-1}$ on $\Gamma$, A is selfadjoint. The function $z \in A$ assumes distinct
values at distinct points of $\Gamma$, so that A is separating, and contains $1 = z^0$. By
Theorem 5.39, A is dense in $C(\Gamma)$.

Given $f \in L^2(\Gamma)$ (with the arc-length measure) and $\epsilon > 0$, let $g \in C(\Gamma)$ be
such that $\|f - g\|_2 < \epsilon/2$ (by density of $C(\Gamma)$ in $L^2(\Gamma)$). Next let $p \in A$ be such
that $\|g - p\|_{C(\Gamma)} < \epsilon/(2\sqrt{2\pi})$ (by density of A in $C(\Gamma)$). Then $\|g - p\|_2 < \epsilon/2$,
and therefore $\|f - p\|_2 < \epsilon$. This shows that A is dense in $L^2(\Gamma)$. Equivalently,
writing $z = e^{ix}$ with $x \in [-\pi, \pi]$, we proved that the span of the (obviously)
orthonormal sequence

$$\Bigl\{ \frac{e^{ikx}}{\sqrt{2\pi}};\ k \in \mathbb{Z} \Bigr\} \qquad (*)$$

in the Hilbert space $L^2(-\pi, \pi)$ is dense, that is, the sequence (*) is an
orthonormal basis for $L^2(-\pi, \pi)$. In particular, every $f \in L^2(-\pi, \pi)$ has the
unique so-called $L^2(-\pi, \pi)$-convergent Fourier expansion

$$f = \sum_{k \in \mathbb{Z}} c_k e^{ikx},$$

with

$$c_k = (1/2\pi) \int_{-\pi}^{\pi} f(x) e^{-ikx}\,dx.$$

By Bessel's identity,

$$\sum_{k \in \mathbb{Z}} |c_k|^2 = (1/2\pi) \|f\|_2^2. \qquad (**)$$

Take, for example, $f = I_{(0,\pi)}$. A simple calculation shows that $c_0 = 1/2$ and
$c_k = (1 - e^{-ik\pi})/(2\pi i k)$ for $k \ne 0$. Therefore, for $k \ne 0$, $|c_k|^2 = (1/4\pi^2 k^2)(2 -
2\cos k\pi)$ vanishes for k even and equals $1/\pi^2 k^2$ for k odd. Substituting in (**),
we get $1/4 + (1/\pi^2) \sum_{k\ \mathrm{odd}} (1/k^2) = 1/2$, hence

$$\sum_{k\ \mathrm{odd}} (1/k^2) = \pi^2/4.$$

If $a := \sum_{k=1}^{\infty} (1/k^2)$, then
$$\sum_{k\ \mathrm{even} \ge 2} (1/k^2) = \sum_{j=1}^{\infty} (1/4j^2) = a/4,$$

and therefore

$$a = \sum_{k\ \mathrm{odd} \ge 1} (1/k^2) + a/4 = \pi^2/8 + a/4.$$

Solving for a, we get $a = \pi^2/6$.
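
The computation in Example (1) can be reproduced numerically. The sketch below evaluates the coefficients $c_k$ of $f = I_{(0,\pi)}$ from the closed formula above and compares truncated sums with the limits $1/2$, $\pi^2/4$, and $\pi^2/6$. The truncation level N is an arbitrary choice, and the partial sums differ from the limits by roughly 1/N.

```python
import cmath
import math

def c(k):
    """Fourier coefficient c_k of the indicator of (0, pi) on (-pi, pi)."""
    if k == 0:
        return 0.5
    return (1 - cmath.exp(-1j * k * math.pi)) / (2j * math.pi * k)

N = 20000  # truncation level (assumption)
parseval_sum = sum(abs(c(k)) ** 2 for k in range(-N, N + 1))   # tends to 1/2
odd_sum = sum(1 / k ** 2 for k in range(-N, N + 1) if k % 2)   # tends to pi^2/4
basel_partial = sum(1 / k ** 2 for k in range(1, N + 1))       # tends to pi^2/6

print(parseval_sum, 0.5)
print(odd_sum, math.pi ** 2 / 4)
print(basel_partial, math.pi ** 2 / 6)
```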
(2) Let

$$f_k(x) = \frac{e^{ikx}}{\sqrt{2\pi}}\, I_{(-\pi,\pi)}(x) \qquad (k \in \mathbb{Z},\ x \in \mathbb{R}).$$

The sequence $\{f_k\}$ is clearly orthonormal in $L^2(\mathbb{R})$. If f belongs to the closure of
$\operatorname{span}\{f_k\}$ and $\epsilon > 0$ is given, there exists h in the span such that $\|f - h\|_2^2 < \epsilon$.
Since h = 0 on $(-\pi, \pi)^c$, we have

$$\int_{(-\pi,\pi)^c} |f|^2\,dx = \int_{(-\pi,\pi)^c} |f - h|^2\,dx < \epsilon,$$

and the arbitrariness of $\epsilon$ implies that the integral on the left-hand side
vanishes. Hence f = 0 a.e. outside $(-\pi, \pi)$ (since f represents an equivalence
class of functions, we may say that f = 0 identically outside $(-\pi, \pi)$). On the
other hand, the density of the span of $\{e^{ikx}\}$ in $L^2(-\pi, \pi)$ means that every
$f \in L^2(\mathbb{R})$ vanishing outside $(-\pi, \pi)$ is in the closure of $\operatorname{span}\{f_k\}$ in $L^2(\mathbb{R})$.
Thus $\{f_k\}$ is an orthonormal sequence in $L^2(\mathbb{R})$ which is not an orthonormal
basis for the space; the closure of its span consists precisely of all $f \in L^2(\mathbb{R})$
vanishing outside $(-\pi, \pi)$.


Theorem 8.12 (Existence of orthonormal bases). Every (non-trivial) 
Hilbert space has an orthonormal base. 


More specifically, given any orthonormal set $A_0$ in the Hilbert space X, it
can be completed to an orthonormal base A.


Proof. Since a non-trivial Hilbert space contains a unit vector $a_0$, the first
statement of the theorem follows from the second with the orthonormal set
$A_0 = \{a_0\}$.

To prove the second statement, consider the family $\mathcal{A}$ of all orthonormal sets
in X containing the given set $A_0$. It is non-empty (since $A_0 \in \mathcal{A}$) and partially
ordered by inclusion. If $\mathcal{A}' \subset \mathcal{A}$ is totally ordered, it is clear that $\bigcup \mathcal{A}'$ is an
orthonormal set (because of the total order!) containing $A_0$, that is, it is an
element of $\mathcal{A}$, and is an upper bound for all the sets in $\mathcal{A}'$. Therefore, by Zorn's
lemma, the family $\mathcal{A}$ contains a maximal element A; A is a maximal orthonormal
set (hence an orthonormal base, cf. Section 8.11) containing $A_0$.
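
In finite dimension no appeal to Zorn's lemma is needed: an orthonormal set can be completed to an orthonormal base by the Gram-Schmidt procedure. The sketch below is a finite-dimensional analogue of the theorem, not the argument of the proof. It extends a given orthonormal set in $\mathbb{R}^n$ by orthogonalizing the standard basis vectors against what has been collected so far. The function name and tolerance are illustrative assumptions.

```python
import math

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(xi * complex(yi).conjugate() for xi, yi in zip(x, y))

def complete_to_basis(a0, dim, tol=1e-10):
    """Extend the orthonormal list a0 to an orthonormal basis of the space."""
    basis = [list(v) for v in a0]
    for i in range(dim):
        r = [1.0 if j == i else 0.0 for j in range(dim)]   # standard vector e_i
        for b in basis:
            coeff = inner(r, b)                            # component along b
            r = [ri - coeff * bi for ri, bi in zip(r, b)]
        length = math.sqrt(inner(r, r).real)
        if length > tol:                                   # e_i adds a new direction
            basis.append([ri / length for ri in r])
    return basis

a0 = [[1 / math.sqrt(3)] * 3]     # one unit vector in R^3
A = complete_to_basis(a0, 3)
gram = [[inner(u, v) for v in A] for u in A]
print(len(A))
```

The resulting Gram matrix is the identity up to rounding, so A is an orthonormal set of three vectors in a three-dimensional space, hence maximal.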


8.4 Hilbert dimension 


Theorem 8.13 (Equi-cardinality of orthonormal bases). All orthonormal 
bases of a given Hilbert space have the same cardinality. 


Proof. Note first that an orthonormal set A is necessarily linearly independent. 
Indeed, if the finite linear combination x := ay Apa, of vectors a, € A 
vanishes, then A; = (x,a,;) = (0,a;) = 0 for all j =1,...,n. 

We consider two orthonormal bases A, B of the Hilbert space X. 

Case 1. At least one of the bases is finite. 

We may assume that $A$ is finite, say $A = \{a_1,\dots,a_n\}$. By Property (5) of the
orthonormal base $A$, the vectors $a_1,\dots,a_n$ span $X$ (in the algebraic sense!), and
are linearly independent by the previous remark. This means that $\{a_1,\dots,a_n\}$ is
a base (in the algebraic sense) for the vector space $X$, and therefore the algebraic
dimension of $X$ is $n$. Since $B \subset X$ is linearly independent (by the previous
remark), its cardinality $|B|$ is at most $n$, that is, $|B| \le |A|$. In particular, $B$ is
finite, and the preceding conclusion applies with $B$ and $A$ interchanged, that is,
$|A| \le |B|$, and the equality $|A| = |B|$ follows.

Case 2. Both bases are infinite. 

With notation as in Theorem 8.7, we claim that

\[ A = \bigcup_{b \in B} A(b). \]

Indeed, suppose some $a \in A$ is not in the shown union. Then $a \notin A(b)$ for all
$b \in B$, that is, $(a,b) = 0$ for all $b \in B$. Hence, $a \in B^\perp = \{0\}$ (by Property (1) of
the orthonormal base $B$), but this is absurd since $a$ is a unit vector.

By Theorem 8.7, each set $A(b)$ is at most countable; therefore the union
above has cardinality $\le \aleph_0 \times |B|$. Also $\aleph_0 \le |B|$ (since $B$ is infinite). Hence $|A| \le
|B|^2 = |B|$ (the last equation is a well-known property of infinite cardinalities).
By symmetry of the present case with respect to $A$ and $B$, we also have $|B| \le |A|$,
and the equality of the cardinalities follows.




Definition 8.14. The cardinality of any (hence of all) orthonormal bases of the
Hilbert space $X$ is called the Hilbert dimension of $X$, denoted $\dim_H X$.


8.5 Isomorphism of Hilbert spaces 


Two Hilbert spaces $X$ and $Y$ are isomorphic if there exists an algebraic
isomorphism $V : X \to Y$ that preserves the inner product: $(Vp, Vq) = (p,q)$
for all $p, q \in X$ (the same notation is used for the inner product in both spaces).

The map V is necessarily isometric (since it is linear and norm-preserving). 
Conversely, by the polarization identity (cf. (11) following Definition 1.34), 
any bijective isometric linear map between Hilbert spaces is an isomorphism 
(of Hilbert spaces). Such an isomorphism is also called a unitary equivalence; 
accordingly, isomorphic Hilbert spaces are said to be unitarily equivalent. 

The isomorphism relation between Hilbert spaces is clearly an equivalence 
relation. Each equivalence class is completely determined by the Hilbert 
dimension: 


Theorem 8.15. Two Hilbert spaces are isomorphic iff they have the same Hilbert 
dimension. 


Proof. If the Hilbert spaces $X, Y$ have the same Hilbert dimension, and $A, B$
are orthonormal bases for $X$ and $Y$, respectively, then since $|A| = |B|$, we can
choose an index set $J$ to index the elements of both $A$ and $B$:

\[ A = \{a_j;\ j \in J\}; \qquad B = \{b_j;\ j \in J\}. \]

By Property (5) of orthonormal bases, there is a unique continuous linear
map $V : X \to Y$ such that $Va_j = b_j$ for all $j \in J$ (namely $Vx = \sum_{j \in J} (x, a_j) b_j$
for all $x \in X$). There is also a unique continuous linear map $W : Y \to X$ such
that $Wb_j = a_j$ for all $j \in J$. Clearly, $VW = WV = I$, where $I$ denotes the
identity map in both spaces. Therefore, $V$ is bijective, and by Parseval's identity

\[ (Vx, Vy) = \Big( \sum_{j \in J} (x, a_j) b_j,\ \sum_{j \in J} (y, a_j) b_j \Big) = \sum_{j \in J} (x, a_j)\overline{(y, a_j)} = (x, y) \]

for all $x, y \in X$. Thus $V$ is a Hilbert space isomorphism of $X$ onto $Y$.
Conversely, suppose the Hilbert spaces $X$ and $Y$ are isomorphic, and let
$V : X \to Y$ be an isomorphism (of Hilbert spaces). If $A$ is an orthonormal
base for $X$, then $VA$ is an orthonormal base for $Y$. Indeed, $VA$ is orthonormal,
because

\[ (Vs, Vt) = (s, t) = \delta_{s,t} = \delta_{Vs,Vt} \]

for all $s, t \in A$ (the last equality follows from the injectiveness of $V$). In order
to show completeness, let $y \in (VA)^\perp$ (in $Y$), and write $y = Vx$ for some $x \in X$
(since $V$ is onto). Then for all $a \in A$

\[ (x, a) = (Vx, Va) = (y, Va) = 0, \]




that is, $x \in A^\perp = \{0\}$ (by Property (1) of the orthonormal base $A$). Hence, $y = 0$,
and we conclude that $VA$ is indeed an orthonormal base for $Y$ (cf. Section 8.11).
Now by Theorem 8.13 and the fact that $V : A \to VA$ is bijective,
$\dim_H Y = |VA| = |A| = \dim_H X$.


8.6 Direct sums 


Let $(X_s, (\cdot,\cdot)_s)_{s \in S}$ be a family of Hilbert spaces. Let $X$ be the vector space (under
pointwise operations) of all $f \in \prod_{s \in S} X_s$ such that $S(f) := \{s \in S;\ f(s) \ne 0\}$ is
at most countable, and

\[ \|f\|^2 := \sum_{s \in S} \|f(s)\|_s^2 < \infty. \]


By the Cauchy–Schwarz inequality in $\mathbb{C}^n$, if $f, g \in X$, the series

\[ (f, g) := \sum_{s \in S} (f(s), g(s))_s \]

converges absolutely (hence unconditionally), and defines an inner product on
$X$ with induced norm $\|\cdot\|$. Let $\{f_n\}$ be a Cauchy sequence with respect to this
norm. For all $s \in S$, the inequality $\|f\| \ge \|f(s)\|_s$ ($f \in X$) implies that $\{f_n(s)\}$
is a Cauchy sequence in the Hilbert space $X_s$. Let $f(s) := \lim_n f_n(s) \in X_s$.
Then $f \in \prod_{s \in S} X_s$ and $S(f) \subset \bigcup_n S(f_n)$ is at most countable. Given $\varepsilon > 0$, let
$n_0 \in \mathbb{N}$ be such that $\|f_n - f_m\| < \varepsilon$ for all $n, m > n_0$. By Fatou's lemma for the
counting measure on $S$, for $n > n_0$,

\[ \|f_n - f\|^2 = \sum_{s \in S} \|f_n(s) - f(s)\|_s^2 = \sum_{s \in S} \liminf_m \|f_n(s) - f_m(s)\|_s^2 \]
\[ \le \liminf_m \sum_{s \in S} \|f_n(s) - f_m(s)\|_s^2 = \liminf_m \|f_n - f_m\|^2 \le \varepsilon^2. \]


This shows that $f = f_n - (f_n - f) \in X$ and $f_n \to f$ in the $X$-norm, so that $X$ is
a Hilbert space. It is usually called the direct sum of the Hilbert spaces $(X_s)_{s \in S}$
and is denoted

\[ X = \sum_{s \in S} \oplus X_s. \]

The elements of $X$ are usually denoted by $\sum_{s \in S} \oplus x_s$ (rather than the functional
notation $f$). In this section we maintain the preceding notation for simplicity of
symbols.

Suppose that for every $s \in S$, $T_s$ is an element of $B(X_s)$, and that the family
$\{T_s\}_{s \in S}$ is uniformly bounded, that is, $\sup_{s \in S} \|T_s\|_{B(X_s)} < \infty$. We define a new
map $T : X \to X$ by

\[ (Tf)(s) := T_s(f(s)) \qquad (f \in X,\ s \in S). \]




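Clearly $Tf \in \prod_{s \in S} X_s$ and $S(Tf) \subset S(f)$ is at most countable. Also,

\[ \|Tf\|^2 = \sum_{s \in S} \|T_s(f(s))\|_s^2 \le \sum_{s \in S} \|T_s\|^2 \|f(s)\|_s^2 \le \Big( \sup_{s \in S} \|T_s\| \Big)^2 \|f\|^2 < \infty. \]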


Therefore $Tf \in X$, and $T$ is a bounded linear operator on $X$ with operator norm
$\le \sup_{s \in S} \|T_s\|$. Actually, we have $\|T\| = \sup_{s \in S} \|T_s\|$. Indeed, for each $s \in S$
and $v \in X_s$, consider the function $f_{s,v} \in X$ defined by

\[ f_{s,v}(s) := v; \qquad f_{s,v}(t) := 0 \quad (t \in S,\ t \ne s). \]

Then $\|f_{s,v}\| = \|v\|_s$ and $\|Tf_{s,v}\| = \|T_s v\|_s$. Hence, $\|T\| \ge \|T_s\|$ for all $s \in S$, and
therefore $\|T\| \ge \sup_{s \in S} \|T_s\|$. Together with the reverse inequality, we obtain
the desired equality $\|T\| = \sup_{s \in S} \|T_s\|$.

The usual notation for the operator $T$ is $\sum_{s \in S} \oplus T_s$. It is called the direct
sum of the operators $(T_s)_{s \in S}$.
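The norm identity $\|T\| = \sup_s \|T_s\|$ can be checked numerically in a toy case. The following is only a sketch under simplifying assumptions that are not part of the text: the index set has three elements, every $X_s$ is real $\mathbb{R}^2$, and the names `blocks`, `op_norm_2x2` and `direct_sum_apply` are hypothetical helpers chosen for illustration.

```python
import math

def op_norm_2x2(m):
    """Operator norm of a real 2x2 matrix: sqrt of the top eigenvalue of m^T m."""
    (a, b), (c, d) = m
    p = a * a + c * c          # entries of the symmetric matrix m^T m
    q = a * b + c * d
    r = b * b + d * d
    top = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(top)

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix and a vector in R^2."""
    (a, b), (c, d) = m
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def direct_sum_apply(blocks, f):
    """(T f)(s) = T_s(f(s)); f is a dict s -> vector, zero outside its keys."""
    return {s: apply(blocks[s], v) for s, v in f.items()}

def norm(f):
    """Direct-sum norm: square root of the sum of the squared block norms."""
    return math.sqrt(sum(x * x + y * y for (x, y) in f.values()))

# Assumed toy data: three blocks T_s acting on R^2.
blocks = {"s1": [[1, 2], [0, 1]], "s2": [[3, 0], [0, 1]], "s3": [[0, 1], [1, 0]]}
sup_norm = max(op_norm_2x2(m) for m in blocks.values())

f = {"s1": (1.0, 1.0), "s2": (1.0, 0.0), "s3": (0.0, 2.0)}
Tf = direct_sum_apply(blocks, f)

# The function f_{s,v} of the text, with s = "s2" and v = (1, 0).
witness = {"s2": (1.0, 0.0)}
witness_ratio = norm(direct_sum_apply(blocks, witness)) / norm(witness)
```

Here `norm(Tf) <= sup_norm * norm(f)` illustrates the upper bound, and `witness_ratio` attains `sup_norm` because the largest block norm is reached on a vector supported in a single coordinate, exactly as in the argument with $f_{s,v}$.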


8.7 Canonical model 


Given any cardinality $\gamma$, choose a set $J$ with this cardinality. Consider the vector
space $\ell^2(J)$ of all functions $f : J \to \mathbb{C}$ with $J(f) := \{j \in J;\ f(j) \ne 0\}$ at most
countable and

\[ \|f\|_2 := \Big( \sum_{j \in J} |f(j)|^2 \Big)^{1/2} < \infty. \]

The inner product associated with the norm $\|\cdot\|_2$ is defined by the absolutely
convergent series

\[ (f, g) := \sum_{j \in J} f(j)\overline{g(j)} \qquad (f, g \in \ell^2(J)). \]

The space $\ell^2(J)$ is the Hilbert direct sum of copies $\mathbb{C}_j$ ($j \in J$) of the Hilbert space
$\mathbb{C}$ (cf. previous section). In particular, $\ell^2(J)$ is a Hilbert space (it is actually
the $L^2$ space of the measure space $(J, \mathcal{P}(J), \mu)$, with $\mu$ the counting measure,
that is, equal to one on singletons). For each $j \in J$, let $a_j \in \ell^2(J)$ be defined
by $a_j(i) = \delta_{i,j}$ ($i, j \in J$). Then $A := \{a_j;\ j \in J\}$ is clearly orthonormal, and
$(f, a_j) = f(j)$ for all $f \in \ell^2(J)$ and $j \in J$. In particular $A^\perp = \{0\}$, so that $A$ is
an orthonormal base for $\ell^2(J)$. Since $|A| = |J| = \gamma$, the space $\ell^2(J)$ has Hilbert
dimension $\gamma$. By Theorem 8.15, every Hilbert space with Hilbert dimension $\gamma$ is
isomorphic to the “canonical model” $\ell^2(J)$.
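The canonical model can be sketched for finitely supported functions. This is an illustration only, under the assumption that an element of $\ell^2(J)$ with finite support is stored as a Python dict from indices to complex values; the index set is arbitrary (any hashable labels), and the names `inner` and `delta` are hypothetical.

```python
def inner(f, g):
    """(f, g) = sum over j of f(j) * conj(g(j)), for finitely supported f, g."""
    common = f.keys() & g.keys()   # outside the common support the terms vanish
    return sum(f[j] * complex(g[j]).conjugate() for j in common)

def delta(j):
    """The basis vector a_j: equal to 1 at j and 0 elsewhere."""
    return {j: 1.0}

# Assumed sample element; the indices need not be numbers.
f = {"apple": 2 + 1j, 7: -3.0, ("x", "y"): 0.5j}

coefficients = {j: inner(f, delta(j)) for j in f}            # (f, a_j) = f(j)
parseval = sum(abs(c) ** 2 for c in coefficients.values())  # sum of |(f, a_j)|^2
norm_squared = inner(f, f).real                             # ||f||_2^2
```

The dictionary `coefficients` reproduces `f` itself, which is the statement $(f, a_j) = f(j)$, and `parseval` agrees with `norm_squared` up to rounding.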


8.8 Tensor products


8.8.1 An interlude: tensor products of vector spaces 


Fix vector spaces $U, V$ over the same field $F$. Recall that their direct sum is a
vector space $U \oplus V$ over $F$ containing “copies” of $U, V$ (as subspaces) whose sum




equals $U \oplus V$ and is direct: $U \cap V = \{0\}$. Similarly, the tensor product vector
space $U \otimes V$ will be spanned by formal products of the form $u \otimes v$ in such a way
that the map $U \times V \ni (u,v) \mapsto u \otimes v \in U \otimes V$ is bilinear, without any further
relation.

An equivalent point of view is that of bases. Recall that a subset $A \subset U$ is
called a Hamel base of $U$ if it is linearly independent, namely, every finite subset
of $A$ is linearly independent, and spanning, that is, each element of $U$ can be
expressed as the linear combination of finitely many elements of $A$; equivalently,
each element of $U$ can be expressed uniquely as the linear combination of finitely
many elements of $A$. If $A, B$ are Hamel bases of $U, V$, respectively, recall that the
disjoint union $A \cup B$ is a Hamel basis of $U \oplus V$; it will follow from our definition
that $A \otimes B := \{a \otimes b;\ a \in A,\ b \in B\}$ is a Hamel basis of $U \otimes V$.

Consider the set $T(U,V)$ of all formal linear combinations of elements of the
cartesian product $U \times V$:

\[ \sum_{u \in U,\, v \in V} \alpha_{u,v}\, [u,v] \]

with $\alpha_{u,v} \in F$ and the support $\{[u,v] \in U \times V : \alpha_{u,v} \ne 0\}$ being finite. It
becomes a vector space over $F$ when endowed with the following operations:

\[ \beta \cdot \sum \alpha_{u,v}\, [u,v] := \sum \beta\alpha_{u,v}\, [u,v], \]
\[ \sum \alpha^1_{u,v}\, [u,v] + \sum \alpha^2_{u,v}\, [u,v] := \sum (\alpha^1_{u,v} + \alpha^2_{u,v})\, [u,v]. \]

Note that the zero element is $\sum 0\, [u,v]$. For convenience, given $u \in U$ and $v \in V$,
write $[u,v]$ for $\sum_{u',v'} \delta_{u,u'}\delta_{v,v'}\, [u',v'] \in T(U,V)$, where $\delta$ stands for Kronecker's
delta.

Define $T_0(U,V)$ to be the linear span in $T(U,V)$ of all elements of the form

\[ \Big[ \sum_{i=1}^{n} \alpha_i u_i,\ \sum_{j=1}^{m} \beta_j v_j \Big] - \sum_{i=1}^{n} \sum_{j=1}^{m} \alpha_i \beta_j\, [u_i, v_j] \]

with $n, m \in \mathbb{N}$, $u_1,\dots,u_n \in U$, $v_1,\dots,v_m \in V$, and $\alpha_1,\dots,\alpha_n,\beta_1,\dots,\beta_m \in F$,
which equals the linear span of all elements of one of the following forms:

\[ \alpha[u,v] - [\alpha u, v], \qquad \alpha[u,v] - [u, \alpha v], \]
\[ [u+u',v] - [u,v] - [u',v], \qquad [u,v+v'] - [u,v] - [u,v'], \]

for $u, u' \in U$, $v, v' \in V$, and $\alpha, \beta \in F$. The quotient vector space
$T(U,V)/T_0(U,V)$ is called the tensor product of $U$ and $V$ and is denoted by
$U \otimes V$. We write $u \otimes v$ for the coset of $[u,v]$ in $U \otimes V$, and call such elements
simple tensors.

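Thus, $U \otimes V$ is spanned by the set of simple tensors and we have the relations

\[ \Big( \sum_{i=1}^{n} \alpha_i u_i \Big) \otimes \Big( \sum_{j=1}^{m} \beta_j v_j \Big) = \sum_{i=1}^{n} \sum_{j=1}^{m} \alpha_i \beta_j\, (u_i \otimes v_j). \]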




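We list the basic properties of tensor products of vector spaces. Recall that
if $\{U_\alpha\}_{\alpha \in A}$ is a family of vector spaces over the same field $F$, then their direct
sum is the set

\[ \sum_{\alpha \in A} \oplus U_\alpha := \Big\{ \{u_\alpha\}_{\alpha \in A} \in \prod_{\alpha \in A} U_\alpha;\ u_\alpha = 0 \text{ for all but finitely many } \alpha \in A \Big\} \]

with the pointwise operations. We will denote $\{u_\alpha\}_{\alpha \in A} \in \sum_{\alpha \in A} \oplus U_\alpha$ by
$\sum_{\alpha \in A} \oplus u_\alpha$.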


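Proposition 8.16. Let $U, V, W, Z$ be vector spaces over the same field $F$.

(i) Commutativity: the map $u \otimes v \mapsto v \otimes u$ ($u \in U$, $v \in V$) extends to a
canonical isomorphism between the vector spaces $U \otimes V$ and $V \otimes U$.

(ii) Associativity: the map $(u \otimes v) \otimes w \mapsto u \otimes (v \otimes w)$ ($u \in U$, $v \in V$,
$w \in W$) extends to a canonical isomorphism between the vector spaces
$(U \otimes V) \otimes W$ and $U \otimes (V \otimes W)$.

(iii) Distributivity: the map $(u + v) \otimes w \mapsto u \otimes w + v \otimes w$ ($u \in U$, $v \in V$,
$w \in W$) extends to a canonical isomorphism between the vector spaces
$(U \oplus V) \otimes W$ and $(U \otimes W) \oplus (V \otimes W)$.

(iv) For each pair of linear maps $T : U \to W$ and $S : V \to Z$ there exists a
unique linear map $T \otimes S : U \otimes V \to W \otimes Z$ that maps $u \otimes v$ to $Tu \otimes Sv$
for each $u \in U$ and $v \in V$.

(v) For every linear functional $\phi$ on $U$ there exists a unique linear map
$\phi \otimes I : U \otimes V \to V$ that maps $u \otimes v$ to $\phi(u)v$ for each $u \in U$ and $v \in V$.

(vi) Let $u_1,\dots,u_n \in U$ be linearly independent and $v_1,\dots,v_n \in V$. Then
$\sum_{i=1}^{n} u_i \otimes v_i = 0$ iff $v_1 = v_2 = \dots = v_n = 0$.

(vii) Let $\{e_\alpha\}_{\alpha \in A}$ be a Hamel base of $U$. The map $\sum_{\alpha \in A} \oplus V \to U \otimes V$ given
by $\sum_{\alpha \in A} \oplus v_\alpha \mapsto \sum_{\alpha \in A} e_\alpha \otimes v_\alpha$ is an isomorphism of vector spaces.

(viii) If $\{e_\alpha\}_{\alpha \in A}$ and $\{f_\beta\}_{\beta \in B}$ are Hamel bases of $U$ and $V$, respectively, then
$\{e_\alpha \otimes f_\beta\}_{(\alpha,\beta) \in A \times B}$ is a Hamel base of $U \otimes V$. In particular,
$\dim(U \otimes V) = \dim(U) \cdot \dim(V)$.

(ix) Let $u_1,\dots,u_n \in U$ and $v_1,\dots,v_n \in V$. Then $\sum_{i=1}^{n} u_i \otimes v_i = 0$ iff there
exists a matrix $(\alpha_{ij})_{1 \le i,j \le n} \in M_n(F)$ such that

\[ \sum_{j=1}^{n} \alpha_{ij} u_j = u_i \qquad (\forall\, 1 \le i \le n), \tag{1} \]
\[ \sum_{i=1}^{n} \alpha_{ij} v_i = 0 \qquad (\forall\, 1 \le j \le n). \tag{2} \]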
i=l 


Proof. (i), (ii), and (iii) are left to the reader as simple exercises.
(iv) Consider the linear map from $T(U,V)$ to $T(W,Z)$ sending $\sum \alpha_{u,v}\, [u,v]$ to
$\sum \alpha_{u,v}\, [Tu, Sv]$ (where as usual, the coefficient family $\{\alpha_{u,v}\}_{u \in U,\, v \in V}$ has finite




support). It maps $T_0(U,V)$ into $T_0(W,Z)$. Consequently, it induces a linear map
from $U \otimes V = T(U,V)/T_0(U,V)$ to $W \otimes Z = T(W,Z)/T_0(W,Z)$, which has the
desired property. Uniqueness is a result of the simple tensors spanning the tensor
product.

(v) Follows readily from Part (iv) and the natural identification of $F \otimes V$
with $V$.

(vi) Sufficiency is clear. Assume that $\sum_{i=1}^{n} u_i \otimes v_i = 0$. By linear independence
of $u_1,\dots,u_n$ there exist linear functionals $\phi_1,\dots,\phi_n$ on $U$ such that $\phi_j(u_i) = \delta_{ij}$
for all $1 \le i, j \le n$. Applying $\phi_j \otimes I$ of Part (v) to both sides of the equation
$\sum_{i=1}^{n} u_i \otimes v_i = 0$ gives $v_j = 0$.

(vii) The map is clearly linear and onto. It is injective by Part (vi). 

(viii) Straightforward either from Part (vi) or from Part (vii). 

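(ix) Sufficiency is clear. To prove necessity, assume that $\sum_{i=1}^{n} u_i \otimes v_i = 0$.
Let $\{e_k\}_{k=1}^{m}$ be a Hamel base of span$\{u_1,\dots,u_n\}$ in $U$ (so $m \le n$). Let
$B = (\beta_{ik})_{1 \le i \le n,\, 1 \le k \le m} \in M_{n \times m}(F)$ and $C = (\gamma_{kj})_{1 \le k \le m,\, 1 \le j \le n} \in M_{m \times n}(F)$ be such that
$u_i = \sum_{k=1}^{m} \beta_{ik} e_k$ for all $1 \le i \le n$ and $e_k = \sum_{j=1}^{n} \gamma_{kj} u_j$ for all $1 \le k \le m$. Set
$A = (\alpha_{ij})_{1 \le i,j \le n} := BC \in M_n(F)$. Then evidently (1) holds. Furthermore,
$\sum_{i=1}^{n} u_i \otimes v_i = 0$ entails that

\[ 0 = \sum_{i=1}^{n} \Big( \sum_{k=1}^{m} \beta_{ik} e_k \Big) \otimes v_i = \sum_{k=1}^{m} e_k \otimes \Big( \sum_{i=1}^{n} \beta_{ik} v_i \Big) \]

by the definition of the tensor product. Hence, using Part (vi) we obtain
$\sum_{i=1}^{n} \beta_{ik} v_i = 0$ for every $1 \le k \le m$. This implies (2), as for $1 \le j \le n$,

\[ \sum_{i=1}^{n} \alpha_{ij} v_i = \sum_{i=1}^{n} \sum_{k=1}^{m} \beta_{ik} \gamma_{kj} v_i = \sum_{k=1}^{m} \gamma_{kj} \Big( \sum_{i=1}^{n} \beta_{ik} v_i \Big) = 0. \]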
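For finite-dimensional coordinate spaces, the relations above can be tried out concretely. The sketch below rests on an assumption that goes beyond the text: $F^m \otimes F^n$ is modelled by $F^{mn}$, with $u \otimes v$ represented by the list of all products $u_i v_j$ (the Kronecker product). The helper names `kron`, `add`, `scale` and `unit` are hypothetical.

```python
def kron(u, v):
    """Coordinates of u (x) v in the basis e_i (x) f_j, ordered lexicographically."""
    return [a * b for a in u for b in v]

def add(x, y):
    """Coordinatewise sum of two vectors of equal length."""
    return [a + b for a, b in zip(x, y)]

def scale(c, x):
    """Scalar multiple of a vector."""
    return [c * a for a in x]

def unit(n, i):
    """The i-th standard basis vector of F^n (0-based)."""
    return [1 if k == i else 0 for k in range(n)]

u1, u2, v = [1, 2], [0, -1], [3, 0, 5]

# Bilinearity in the first variable: (2*u1 + u2) (x) v = 2*(u1 (x) v) + u2 (x) v.
left = kron(add(scale(2, u1), u2), v)
right = add(scale(2, kron(u1, v)), kron(u2, v))

# Part (viii) in this model: the e_i (x) f_j are exactly the 2 * 3 = 6 standard
# basis vectors of F^6.
basis = [kron(unit(2, i), unit(3, j)) for i in range(2) for j in range(3)]
```

In this model `left == right`, and `basis` lists the standard basis of $F^6$, which matches $\dim(U \otimes V) = \dim(U) \cdot \dim(V)$.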


8.8.2 Tensor products of Hilbert spaces 


Here we present the Hilbert space counterpart of the tensor product of vector 
spaces. 

Henceforth, the tensor product of two inner product spaces $X, Y$ as complex
vector spaces, as defined in Section 8.8.1, will be called their algebraic tensor
product and will be denoted by $X \otimes_{\mathrm{alg}} Y$.


Proposition 8.17. Let $X, Y$ be inner product spaces. There exists on $X \otimes_{\mathrm{alg}} Y$
a unique inner product $(\cdot,\cdot)$ satisfying $(x \otimes y, x' \otimes y') := (x,x')_X \cdot (y,y')_Y$ for all
$x, x' \in X$ and $y, y' \in Y$.


Proof. Uniqueness is obvious as the algebraic tensor product is spanned by
simple tensors. The existence of a sesquilinear map satisfying the required
condition is easy: one defines it first at the level of $T(X,Y)$, and makes sure
that the result is zero when either variable belongs to $T_0(X,Y)$.

To prove positive definiteness, note that every element of $X \otimes_{\mathrm{alg}} Y$ can
be written as $w = \sum_{i=1}^{n} \sum_{j=1}^{m} \alpha_{ij}\, e_i \otimes f_j$, where $\{e_i\}_{i=1}^{n}$ and $\{f_j\}_{j=1}^{m}$ are




orthonormal in $X$ and $Y$, respectively, and the $\alpha_{ij}$ are scalars. We then have
$(w,w) = \sum_{i=1}^{n} \sum_{j=1}^{m} |\alpha_{ij}|^2$. This number is evidently non-negative, and equals
zero only when $w = 0$.
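The defining identity of Proposition 8.17 can be observed numerically. This is a sketch under assumptions not made in the text: $X = \mathbb{C}^2$ and $Y = \mathbb{C}^3$ with their standard inner products (linear in the first variable), and $x \otimes y$ modelled by the Kronecker product of coordinate lists. The names `inner` and `kron` are hypothetical.

```python
def inner(x, y):
    """Standard inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def kron(x, y):
    """Coordinates of the simple tensor x (x) y."""
    return [a * b for a in x for b in y]

# Assumed sample vectors.
x, x2 = [1 + 2j, -1j], [2, 1 + 1j]
y, y2 = [0.5, 2 - 1j, 3], [1j, 1, -1]

lhs = inner(kron(x, y), kron(x2, y2))   # (x (x) y, x' (x) y')
rhs = inner(x, x2) * inner(y, y2)       # (x, x')_X * (y, y')_Y

# Taking x' = x and y' = y gives ||x (x) y||^2 = ||x||^2 * ||y||^2.
norm_sq_tensor = inner(kron(x, y), kron(x, y)).real
norm_sq_product = inner(x, x).real * inner(y, y).real
```

Up to rounding, `lhs` equals `rhs`, and `norm_sq_tensor` equals `norm_sq_product`; the latter is the identity $\|x \otimes y\| = \|x\|\,\|y\|$ used later in this section.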


Let $X, Y$ be Hilbert spaces. The tensor product Hilbert space $X \otimes Y$ is the
completion of the inner product space $X \otimes_{\mathrm{alg}} Y$.
Recall the Hilbert space direct sum construction of Section 8.6.


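Proposition 8.18. Let $X, Y, Z, W$ be Hilbert spaces.

(i) Commutativity: the map $x \otimes y \mapsto y \otimes x$ ($x \in X$, $y \in Y$) extends to a
canonical isomorphism between the Hilbert spaces $X \otimes Y$ and $Y \otimes X$.

(ii) Associativity: the map $(x \otimes y) \otimes z \mapsto x \otimes (y \otimes z)$ ($x \in X$, $y \in Y$, $z \in Z$)
extends to a canonical isomorphism between the Hilbert spaces $(X \otimes Y) \otimes Z$
and $X \otimes (Y \otimes Z)$.

(iii) Distributivity: the map $(x + y) \otimes z \mapsto x \otimes z + y \otimes z$ ($x \in X$, $y \in Y$, $z \in Z$)
extends to a canonical isomorphism between the Hilbert spaces $(X \oplus Y) \otimes Z$
and $(X \otimes Z) \oplus (Y \otimes Z)$.

(iv) Let $\{e_\alpha\}_{\alpha \in A}$ be an orthonormal base of $X$. The map $\sum_{\alpha \in A} \oplus Y \to X \otimes Y$
given by $\sum_{\alpha \in A} \oplus y_\alpha \mapsto \sum_{\alpha \in A} e_\alpha \otimes y_\alpha$ is an isomorphism of Hilbert spaces.

(v) If $\{e_\alpha\}_{\alpha \in A}$ and $\{f_\beta\}_{\beta \in B}$ are orthonormal bases of $X$ and $Y$, respectively,
then $\{e_\alpha \otimes f_\beta\}_{(\alpha,\beta) \in A \times B}$ is an orthonormal base of $X \otimes Y$. In particular,
$\dim_H(X \otimes Y) = \dim_H X \cdot \dim_H Y$.

(vi) For all $T \in B(X,W)$ and $S \in B(Y,Z)$ there exists a unique map $T \otimes S \in
B(X \otimes Y, W \otimes Z)$ that maps $x \otimes y$ to $Tx \otimes Sy$ for each $x \in X$ and $y \in Y$.
We have $\|T \otimes S\| = \|T\|\,\|S\|$.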


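Proof. We leave (i), (ii), and (iii) as exercises.

(iv) Let $\sum_{\alpha \in A} \oplus y_\alpha \in \sum_{\alpha \in A} \oplus Y$. Recall that (it is countably supported
and) $\sum_{\alpha \in A} \|y_\alpha\|^2 < \infty$. The family $\{e_\alpha \otimes y_\alpha\}_{\alpha \in A}$ in $X \otimes Y$ is orthogonal and
$\sum_{\alpha \in A} \|e_\alpha \otimes y_\alpha\|^2 = \sum_{\alpha \in A} \|y_\alpha\|^2$. This implies that the map $\Phi : \sum_{\alpha \in A} \oplus Y \to
X \otimes Y$ given by $\sum_{\alpha \in A} \oplus y_\alpha \mapsto \sum_{\alpha \in A} e_\alpha \otimes y_\alpha$ is well defined and isometric. On
the other hand, every simple tensor $x \otimes y$ can be written as $\sum_{\alpha \in A} (x, e_\alpha)\, e_\alpha \otimes y =
\sum_{\alpha \in A} e_\alpha \otimes (x, e_\alpha)\, y$, so the image of $\Phi$ contains the dense subspace $X \otimes_{\mathrm{alg}} Y$
of $X \otimes Y$. But this image is closed because $\Phi$ is isometric and $\sum_{\alpha \in A} \oplus Y$ is a
Hilbert space. Hence, $\Phi$ is onto.

(v) Follows from Part (iv).

(vi) The linear map from $\sum_{\alpha \in A} \oplus Y$ to $\sum_{\alpha \in A} \oplus Z$ given by $\sum_{\alpha \in A} \oplus y_\alpha \mapsto
\sum_{\alpha \in A} \oplus S y_\alpha$ is well defined and bounded by $\|S\|$, because $\sum_{\alpha \in A} \|S y_\alpha\|^2 \le
\|S\|^2 \sum_{\alpha \in A} \|y_\alpha\|^2 = \|S\|^2 \big\| \sum_{\alpha \in A} \oplus y_\alpha \big\|^2$ (in fact, its norm is precisely $\|S\|$).
Using the Hilbert space isomorphism of Part (iv) we obtain a map $I \otimes S$ with
the desired property. By symmetry, we also get $T \otimes I$. Composing them yields
the map $T \otimes S$ and we have $\|T \otimes S\| \le \|T\|\,\|S\|$. Since $\|x \otimes y\| = \|x\|\,\|y\|$ for all
$x \in X$ and $y \in Y$, the reverse inequality $\|T \otimes S\| \ge \|T\|\,\|S\|$ is obvious.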




Exercises 


The trigonometric Hilbert basis of L? 


1. Let $m$ denote the normalized Lebesgue measure on $[-\pi,\pi]$. (The integral
of $f \in L^1(m)$ over this interval will be denoted by $\int f\,dm$.) Let
$e_k(x) = e^{ikx}$ ($k \in \mathbb{Z}$), and denote

\[ s_n = \sum_{|k| \le n} e_k \qquad (n = 0,1,2,\dots), \]
\[ \sigma_n = (1/n) \sum_{j=0}^{n-1} s_j \qquad (n \in \mathbb{N}). \]

Note that $\sigma_1 = s_0 = e_0 = 1$, $\{e_k;\ k \in \mathbb{Z}\}$ is orthonormal in the Hilbert
space $L^2(m)$, and

\[ \int s_n\,dm = (s_n, e_0) = \sum_{|k| \le n} (e_k, e_0) = 1, \]
\[ \int \sigma_n\,dm = (1/n) \sum_{j=0}^{n-1} \int s_j\,dm = 1. \]

Prove

(a)
\[ s_n(x) = \frac{\cos(nx) - \cos((n+1)x)}{1 - \cos x} = \frac{\sin((n + 1/2)x)}{\sin(x/2)}. \]

(b)
\[ \sigma_n(x) = (1/n)\, \frac{1 - \cos(nx)}{1 - \cos x} = (1/n) \Big( \frac{\sin(nx/2)}{\sin(x/2)} \Big)^2. \]

(c) $\{\sigma_n\}$ is an “approximate identity” in the sense of Exercise 6,
Chapter 4. Consequently, $\sigma_n * f \to f$ uniformly on $[-\pi,\pi]$ if $f \in
C([-\pi,\pi])$ and $f(\pi) = f(-\pi)$, and in $L^p$-metric if $f \in L^p := L^p(m)$
(for each $p \in [1,\infty)$) (cf. Exercise 6, Chapter 4).

(d) Consider the orthogonal projections $P$ and $P_n$ associated with the
orthonormal sets $\{e_k;\ k \in \mathbb{Z}\}$ and $\{e_k;\ |k| \le n\}$ respectively, and
denote $Q_n = (1/n) \sum_{j=0}^{n-1} P_j$ for $n \in \mathbb{N}$.

Terminology: $Pf := \sum_{k \in \mathbb{Z}} (f, e_k) e_k$ is called the (formal) Fourier
series of $f$ for any integrable $f$ (it converges in $L^2$ if $f \in L^2$); $(f, e_k)$
is the $k$th Fourier coefficient of $f$; $P_n f$ is the $n$th partial sum of the
Fourier series for $f$; $Q_n f$ is the $n$th Cesàro mean of the Fourier series
of $f$.

Observe that $P_n f = s_n * f$ and $Q_n f = \sigma_n * f$ for any integrable
function $f$. Consequently, $Q_n f \to f$ uniformly in $[-\pi,\pi]$ if $f \in C_T :=$




$\{f \in C([-\pi,\pi]);\ f(\pi) = f(-\pi)\}$, and in $L^p$-norm if $f \in L^p$ (for each
$p \in [1,\infty)$). If $f \in L^\infty := L^\infty(-\pi,\pi)$, $Q_n f \to f$ in the weak*-
topology on $L^\infty$ (cf. Exercise 6, Chapter 4).

(e) $\{e_k;\ k \in \mathbb{Z}\}$ is a Hilbert basis for $L^2(m)$. (Note that $Q_n f \in$ span$\{e_k\}$
and use Part (d).)
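Before proving the closed forms in Parts (a) and (b) of Exercise 1, it may help to test them numerically. The sketch below does only that, at points $x$ that are not multiples of $2\pi$; it is no substitute for the proof. The function names `s`, `sigma`, `s_closed` and `sigma_closed` are hypothetical.

```python
import cmath
import math

def s(n, x):
    """s_n(x): the sum of e^{ikx} over |k| <= n (the imaginary parts cancel)."""
    return sum(cmath.exp(1j * k * x) for k in range(-n, n + 1)).real

def sigma(n, x):
    """sigma_n(x) = (1/n) * (s_0(x) + ... + s_{n-1}(x))."""
    return sum(s(j, x) for j in range(n)) / n

def s_closed(n, x):
    """Closed form from Part (a); requires sin(x/2) != 0."""
    return math.sin((n + 0.5) * x) / math.sin(x / 2)

def sigma_closed(n, x):
    """Closed form from Part (b); requires sin(x/2) != 0."""
    return (math.sin(n * x / 2) / math.sin(x / 2)) ** 2 / n

points = [0.3, 1.1, -2.5]
max_error_s = max(abs(s(n, x) - s_closed(n, x))
                  for n in range(0, 7) for x in points)
max_error_sigma = max(abs(sigma(n, x) - sigma_closed(n, x))
                      for n in range(1, 7) for x in points)
```

Both `max_error_s` and `max_error_sigma` should be at the level of rounding error. The closed form for $\sigma_n$ also makes its non-negativity visible, which the defining sum does not.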


Fourier coefficients 


2. Use notation as in Exercise 1. Given $k \in \mathbb{Z} \mapsto c_k \in \mathbb{C}$, denote $g_n =
\sum_{|k| \le n} c_k e_k$ ($n = 0,1,2,\dots$) and $G_n = (1/n) \sum_{j=0}^{n-1} g_j$ ($n \in \mathbb{N}$). Note
that $(g_n, e_m) = c_m$ for $n \ge |m|$ and $= 0$ for $n < |m|$, and consequently
$(G_n, e_k) = (1 - |k|/n)\, c_k$ for $n \ge |k|$ and $= 0$ for $n < |k|$.

(a) Let $p \in (1,\infty]$. Prove that if

\[ M := \sup_n \|G_n\|_p < \infty, \]

then there exists $f \in L^p$ such that $c_k = (f, e_k)$ for all $k \in \mathbb{Z}$,
and conversely. (Hint: the ball $\bar{B}(0, M)$ in $L^p$ is weak*-compact, cf.
Theorems 5.24 and 4.6. For the converse, see Exercise 1.)

(b) If $\{G_n\}$ converges in $L^1$-norm, then there exists $f \in L^1$ such that
$c_k = (f, e_k)$ for all $k \in \mathbb{Z}$, and conversely.

(c) If $\{G_n\}$ converges uniformly in $[-\pi,\pi]$, then there exists $f \in C_T$ such
that $c_k = (f, e_k)$ for all $k \in \mathbb{Z}$, and conversely.

(d) If $\sup_n \|G_n\|_1 < \infty$, there exists a complex Borel measure $\mu$ on $[-\pi,\pi]$
with $\mu(\{\pi\}) = \mu(\{-\pi\})$ (briefly, $\mu \in M_T$) such that $c_k = \int e_{-k}\,d\mu$ for
all $k$, and conversely. (Hint: consider the measures $d\mu_n = G_n\,dm$, and
apply Theorems 4.9 and 5.24.)

(e) If $G_n \ge 0$ for all $n \in \mathbb{N}$, there exists a finite positive Borel measure $\mu$
as in Part (d), and conversely.


Poisson integrals 


3. Use notation as in Exercise 1. Let $D$ be the open unit disc in $\mathbb{C}$.

(a) Verify that $(e_1 + z)/(e_1 - z) = 1 + 2 \sum_{k \in \mathbb{N}} e_{-k} z^k$ for all $z \in D$, where
the series converges absolutely and uniformly in $z$ in any compact
subset of $D$. Conclude that for any complex Borel measure $\mu$ on
$[-\pi,\pi]$,

\[ g(z) := \int \frac{e_1 + z}{e_1 - z}\,d\mu = \mu([-\pi,\pi]) + 2 \sum_{k \in \mathbb{N}} c_k z^k, \tag{3} \]

where $c_k = \int e_{-k}\,d\mu$ and integration is over $[-\pi,\pi]$. In particular, $g$
is analytic in $D$, and if $\mu$ is real, $\Re g(z) = \int \Re((e_1 + z)/(e_1 - z))\,d\mu$ is


(real) harmonic in $D$. Verify that the “kernel” in the last integral has
the form $P_r(\theta - t)$, where $z = re^{i\theta}$ and

\[ P_r(\theta) := \frac{1 - r^2}{1 - 2r\cos\theta + r^2} \]

is the classical Poisson kernel (for the disc). Thus

\[ (\Re g)(re^{i\theta}) = (P_r * \mu)(\theta) := \int P_r(\theta - t)\,d\mu(t). \tag{4} \]

(b) Let $\mu$ be a complex Borel measure on $[-\pi,\pi]$. Then $P_r * \mu$ is a complex
harmonic function in $D$ (as a function of $z = re^{i\theta}$). (This is true in
particular for $P_r * f$, for any $f \in L^1(m)$.)

(c) Verify that $\{P_r;\ 0 < r < 1\}$ is an “approximate identity” for $L^1$ in
the sense of Exercise 6, Chapter 4 (with the continuous parameter $r$
instead of the discrete $n$). Consequently, as $r \to 1$,
(i) if $f \in L^p$ for some $p < \infty$, then $P_r * f \to f$ in $L^p$-norm;
(ii) if $f \in C_T$, then $P_r * f \to f$ uniformly in $[-\pi,\pi]$;
(iii) if $f \in L^\infty$, then $P_r * f \to f$ in the weak*-topology on $L^\infty$;
(iv) if $\mu \in M_T$, then $(P_r * \mu)\,dm \to d\mu$ in the weak*-topology.

(d) For $\mu \in M_T$, denote $F(t) = \mu([-\pi,t))$ and verify the identity


(Px@=r f K=O Da, (5) 
where

\[ K_r(t) = \frac{(1 - r^2)\sin^2 t}{(1 - 2r\cos t + r^2)^2}. \tag{6} \]


Verify that $\{K_r;\ 0 < r < 1\}$ is an approximate identity for $L^1(m)$ in
the sense of Exercise 6, Chapter 4. (Hint: integration by parts.)

(e) Let $G_\theta(t)$ denote the function integrated against $K_r(t)$ in (5). If $F$
is differentiable at the point $\theta$, $G_\theta(\cdot)$ is continuous at 0 and $G_\theta(0) =
F'(\theta)$. Conclude from Part (d) that $P_r * \mu \to 2\pi F'$ ($= 2\pi D\mu = d\mu/dm$)
as $r \to 1$ at all points $\theta$ where $F$ is differentiable, that is, $m$-almost
everywhere in $[-\pi,\pi]$. (Cf. Theorem 3.28 with $k = 1$ and Exercise 4e,
Chapter 3; note that here $m$ is normalized Lebesgue measure on
$[-\pi,\pi]$.) This is the “radial limit” version of Fatou's theorem on
“Poisson integrals”. (The same conclusion is true with “non-tangential
convergence” of $re^{i\theta}$ to points of the unit circle.)

(f) State and prove the analogue of Exercise 2 for the representation of
harmonic functions in $D$ as Poisson integrals.
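The expansion in Part (a) of Exercise 3 and the closed form of the Poisson kernel can be compared numerically. Taking real parts in $1 + 2\sum_k e_{-k}(t) z^k$ with $z = re^{i\theta}$ gives $1 + 2\sum_k r^k \cos(k(\theta - t))$, and the sketch below checks that this agrees with $P_r(\theta - t)$ at a few sample points. The function names and the truncation level `terms` are hypothetical choices made for illustration.

```python
import cmath
import math

def poisson(r, theta):
    """Closed form P_r(theta) = (1 - r^2) / (1 - 2 r cos(theta) + r^2)."""
    return (1 - r * r) / (1 - 2 * r * math.cos(theta) + r * r)

def poisson_series(r, theta, terms=400):
    """Truncated series 1 + 2 * sum over k >= 1 of r^k cos(k * theta)."""
    return 1 + 2 * sum(r ** k * math.cos(k * theta) for k in range(1, terms + 1))

def kernel_real_part(r, theta, t):
    """Real part of (e_1(t) + z) / (e_1(t) - z) with z = r e^{i theta}."""
    z = r * cmath.exp(1j * theta)
    e1 = cmath.exp(1j * t)
    return ((e1 + z) / (e1 - z)).real

samples = [(0.3, 0.4, -1.0), (0.7, 2.0, 0.5), (0.9, -1.3, 2.2)]  # (r, theta, t)
series_error = max(abs(poisson_series(r, th - t) - poisson(r, th - t))
                   for r, th, t in samples)
kernel_error = max(abs(kernel_real_part(r, th, t) - poisson(r, th - t))
                   for r, th, t in samples)
```

Both errors should be negligible; for $r = 0.9$ the neglected tail of the series is of order $0.9^{400}$, far below rounding error.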


4. Poisson integrals in the right half-plane. Let $\mathbb{C}^+$ denote the right half-
plane, and


= x 




(This is the so-called Poisson kernel of the right half-plane.) Prove: 


(a) $\{P_x;\ x > 0\}$ is an approximate identity for $L^1(\mathbb{R})$ (as $x \to 0+$)
(cf. Exercise 6, Chapter 4). Consequently, as $x \to 0+$, $P_x * f \to f$
uniformly on $\mathbb{R}$ if $f \in C_c(\mathbb{R})$, and in $L^p$-norm if $f \in L^p(\mathbb{R})$
($1 \le p < \infty$).

(b) $(P_x * f)(y)$ is a harmonic function of $(x,y)$ in $\mathbb{C}^+$.

(c) If $f \in L^p(\mathbb{R})$, then for each $\delta > 0$, $(P_x * f)(y) \to 0$ uniformly for $x \ge \delta$
as $x^2 + y^2 \to \infty$. (Hint: use Hölder's inequality with the probability
measure $(1/\pi) P_x(y - t)\,dt$ for $x, y$ fixed.)

(d) If $f \in L^1(dt/(1 + t^2))$, then $P_x * f \to f$ as $x \to 0+$ pointwise a.e.
on $\mathbb{R}$. (Hint: revisit the argument in Parts (d)-(e) of the preceding
exercise, or transform the disc onto the half-plane and use Fatou's
theorem for the disc.)


5. Let $\mu$ be a complex Borel measure on $[-\pi,\pi]$. Show that

\[ \lim_{n \to \infty} \int e_{-n}\,d\mu = 0 \tag{7} \]

iff

\[ \lim_{n \to \infty} \int e_{-n}\,d|\mu| = 0. \tag{8} \]

(Hint: (7) for the measure $\mu$ implies (7) for the measure $d\nu = h\,d\mu$ for
any trigonometric polynomial $h$; use a density argument and the relation
$d|\mu| = h\,d\mu$ for an appropriate $h$.)


Divergence of Fourier series 


6. Use notation as in Exercise 1. Consider the partial sums $P_n f$ of the Fourier
series of $f \in C_T$. Let

\[ \phi_n(f) = (P_n f)(0) = (s_n * f)(0) \qquad (f \in C_T). \tag{9} \]

Prove:

(a) For each $n \in \mathbb{N}$, $\phi_n$ is a bounded linear functional on $C_T$ with
norm $\|s_n\|_1$ (the $L^1(m)$-norm of $s_n$). (Hint: $\|\phi_n\| \le \|s_n\|_1$ trivially.
Consider real functions $f_j \in C_T$ such that $\|f_j\|_u \le 1$ and $f_j \to \operatorname{sgn} s_n$
a.e., cf. Exercise 9, Chapter 3. Then $\phi_n(f_j) \to \|s_n\|_1$.)

(b) $\lim_n \|s_n\|_1 = \infty$. (Use the fact that the Dirichlet integral
$\int_0^\infty ((\sin t)/t)\,dt$ does not converge absolutely.)

(c) The subspace

\[ Z := \{f \in C_T;\ \sup_n |\phi_n(f)| < \infty\} \]




is of Baire's first category in the Banach space $C_T$. Conclude that
the subspace of $C_T$ consisting of all $f \in C_T$ with convergent Fourier
series at 0 is of Baire's first category in $C_T$. (Hint: assume $Z$ is of
Baire's second category in $C_T$ and apply Theorem 6.4 and Parts (a)
and (b).)
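The growth claimed in Part (b) of Exercise 6 can be observed numerically, although slowly: the norms $\|s_n\|_1$ grow only logarithmically. The sketch below approximates the integral by a midpoint rule, using the closed form of $s_n$ from Exercise 1(a). The function name `l1_norm_s` and the sample count are hypothetical choices, and the output is an approximation, not a proof.

```python
import math

def l1_norm_s(n, samples=20000):
    """Midpoint-rule approximation of ||s_n||_1 = (1/(2 pi)) * int_{-pi}^{pi} |s_n|.

    With an even number of samples no midpoint equals 0, so sin(x/2) != 0.
    """
    h = 2 * math.pi / samples
    total = 0.0
    for i in range(samples):
        x = -math.pi + (i + 0.5) * h
        total += abs(math.sin((n + 0.5) * x) / math.sin(x / 2))
    return total * h / (2 * math.pi)

norms = {n: l1_norm_s(n) for n in (1, 4, 16, 64)}
```

The values in `norms` increase with $n$, in line with $\lim_n \|s_n\|_1 = \infty$; the unbounded sequence of functional norms is what the Baire category argument in Part (c) exploits.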


Fourier coefficients of L' functions 


7. Use notation as in Exercise 1. 


(a) If $f \in L^1 := L^1(m)$, prove that

\[ \lim_{|k| \to \infty} (f, e_k) = 0. \tag{10} \]

Hint: if $f = e_n$ for some $n \in \mathbb{Z}$, $(f, e_k) = 0$ for all $k$ such that $|k| > |n|$,
and (10) is trivial. Hence, (10) is true for $f \in$ span$\{e_n;\ n \in \mathbb{Z}\}$, and the
general case follows by density of this span in $L^1$, cf. Exercise 1,
Part (d).

(b) Consider $\mathbb{Z}$ with the discrete topology (this is a locally compact
Hausdorff space!), and let $c_0 := C_0(\mathbb{Z})$; cf. Definition 3.23. Consider
the map

\[ F : f \in L^1 \mapsto \{(f, e_k)\} \in c_0 \]

(cf. Part (a)). Then $F \in B(L^1, c_0)$ is one-to-one. (Hint: if $(f, e_k) = 0$
for all $k \in \mathbb{Z}$, then $(f, g) = 0$ for all $g \in$ span$\{e_k\}$, hence for all $g \in C_T$
by Exercise 1(d), hence for $g = I_E$ for any measurable subset $E$ of
$[-\pi,\pi]$; cf. Exercise 9, Chapter 3, and Proposition 1.22.)

(c) Prove that the range of $F$ is of Baire's first category in the Banach
space $c_0$ (in particular, $F$ is not onto). (Hint: if the range of $F$ is of
Baire's second category in $c_0$, $F^{-1}$ with domain range $F$ is continuous,
by Theorem 6.9. Therefore there exists $c > 0$ such that $\|Ff\|_u \ge
c\|f\|_1$ for all $f \in L^1$. Get a contradiction by choosing $f = s_n$; cf.
Exercise 6(b).)


Miscellaneous 


8. Let $\{a_n\}$ and $\{b_n\}$ be two orthonormal sequences in the Hilbert space $X$
such that

\[ \sum_n \|b_n - a_n\|^2 < 1. \]

Prove that $\{a_n\}$ is a Hilbert basis for $X$ iff this is true for $\{b_n\}$.


9. Let $\{a_k\}$ be a Hilbert basis for the Hilbert space $X$. Define $T \in B(X)$ by
$Ta_k = a_{k+1}$, $k \in \mathbb{N}$. Prove:

(a) $T$ is isometric.

(b) $T^n \to 0$ in the weak operator topology.


10. If $\{a_n\}$ is an orthonormal (infinite) sequence in an inner product
space, then $a_n \to 0$ weakly (however, $\{a_n\}$ has no strongly convergent
subsequence, because $\|a_n - a_m\|^2 = 2$ for $n \ne m$; this shows in particular that the
closed unit ball of an infinite dimensional Hilbert space is not strongly
compact).


11. Let $\{x_\alpha\}$ be a net in the inner product space $X$. Then $x_\alpha \to x \in X$
strongly iff $x_\alpha \to x$ weakly and $\|x_\alpha\| \to \|x\|$.


12. Let $A$ be an orthonormal basis for the Hilbert space $X$. Prove:

(a) If $f : A \to X$ is any map such that $(f(a), a) = 0$ for all $a \in A$, then
it does not necessarily follow that $f = 0$ (the zero map on $A$).

(b) If $T \in B(X)$ is such that $(Tx, x) = 0$ for all $x \in X$, then $T = 0$ (the
zero operator).

(c) If $S, T \in B(X)$ are such that $(Tx, x) = (Sx, x)$ for all $x \in X$, then
$S = T$.


13. Let $X$ be a Hilbert space, and let $\mathcal{N}$ be the set of normal operators in
$B(X)$. Prove that the adjoint operation $T \mapsto T^*$ is continuous on $\mathcal{N}$ in
the s.o.t.


14. Let $X$ be a Hilbert space, and $T \in B(X)$. Denote by $P(T)$ and $Q(T)$
the orthogonal projections onto the closed subspaces $\ker T$ and $\overline{TX}$,
respectively. Prove:

(a) The complementary orthogonal projections of $P(T)$ and $Q(T)$ are
$Q(T^*)$ and $P(T^*)$, respectively.

(b) $P(T^*T) = P(T)$ and $Q(T^*T) = Q(T^*)$.


15. For any (non-empty) set $A$, denote by $B(A)$ the C*-algebra of all bounded
complex functions on $A$ with pointwise operations, the involution $f \mapsto \bar{f}$
(complex conjugation), and the supremum norm $\|f\|_u = \sup_A |f|$.

Let $A$ be an orthonormal basis of the Hilbert space $X$. For each
$f \in B(A)$ and $x \in X$, let

\[ T_f x = \sum_{a \in A} f(a)(x, a)a. \]

Prove:

(a) The map $f \mapsto T_f$ is an isometric *-isomorphism of the C*-algebra
$B(A)$ into $B(X)$. (In particular, $T_f$ is a normal operator.)

(b) $T_f$ is selfadjoint (positive, unitary) iff $f$ is real-valued ($f \ge 0$, $|f| = 1$,
respectively).


16. Let $(S, \mathcal{A}, \mu)$ be a $\sigma$-finite positive measure space. Consider $L^\infty(\mu)$ as
a C*-algebra with pointwise multiplication and complex conjugation as
involution. Let $p \in [1,\infty)$. For each $f \in L^\infty(\mu)$ define

\[ T_f g = fg \qquad (g \in L^p(\mu)). \]

Prove:

(a) The map $f \mapsto T_f$ is an isometric isomorphism of $L^\infty(\mu)$ into
$B(L^p(\mu))$; in case $p = 2$, the map is an isometric *-isomorphism
of $L^\infty(\mu)$ onto a commutative C*-algebra of (normal) operators on
$L^2(\mu)$.

(b) (Case $p = 2$.) $T_f$ is selfadjoint (positive, unitary) iff $f$ is real-valued
($f \ge 0$, $|f| = 1$, respectively) almost everywhere.


17. Let $X$ be a Hilbert space. Show that multiplication in $B(X)$ is not
(jointly) continuous in the w.o.t., even on the norm-closed unit ball $B(X)_1$
of $B(X)$ in the relative w.o.t., unless $X$ is finite dimensional; however, it
is continuous on $B(X)_1$ in the relative s.o.t.


18. Use notation as in Exercise 17. Prove that $B(X)_1$ is compact in the w.o.t.,
but not in the s.o.t. (unless $X$ is finite dimensional).


19. Let $X, Y$ be Hilbert spaces, and $T \in B(X,Y)$. Imitate the definition of the
Hilbert adjoint of an operator in $B(X)$ to define the (Hilbert) adjoint $T^* \in
B(Y,X)$. Observe that $T^*T$ is a positive operator in $B(X)$. Let $\{a_n\}_{n \in \mathbb{N}}$
be an orthonormal basis for $X$ and let $Vx = \{(x, a_n)\}$. We know that $V$ is
a Hilbert space isomorphism of $X$ onto $\ell^2$. What are $V^*$, $V^{-1}$, and $V^*V$?


20. Let $\{a_n\}_{n \in \mathbb{N}}$ be an orthonormal basis for the Hilbert space $X$, and let
$Q \in B(X)$ be invertible in $B(X)$. Let $b_n = Qa_n$, $n \in \mathbb{N}$. Prove that there
exist positive constants $A, B$ such that

\[ A \sum |\lambda_k|^2 \le \Big\| \sum \lambda_k b_k \Big\|^2 \le B \sum |\lambda_k|^2 \]

for all finite sets of scalars $\lambda_k$.


21. Let $X$ be a Hilbert space. A sequence $\{a_n\} \subset X$ is upper (lower) Bessel
if there exists a positive constant $B$ (resp. $A$) such that

\[ \sum_n |(x, a_n)|^2 \le B\|x\|^2 \qquad (\ge A\|x\|^2, \text{ resp.}) \]

for all $x \in X$. The sequence is two-sided Bessel if it is both upper and lower
Bessel (e.g., an orthonormal sequence is upper Bessel with $B = 1$, and is
two-sided Bessel with $A = B = 1$ iff it is an orthonormal basis for $X$).

(a) Let $\{a_n\}$ be an upper Bessel sequence in $X$, and define $V$ as in
Exercise 19. Then $V \in B(X, \ell^2)$ and $\|V\| \le B^{1/2}$. On the other
hand, for any $\{\lambda_n\} \in \ell^2$, the series $\sum \lambda_n a_n$ converges in $X$ and its
sum equals $V^*\{\lambda_n\}$. The operator $S := V^*V \in B(X)$ is a positive
operator with norm $\le B$.

(b) If $\{a_n\}$ is two-sided Bessel, $S - AI$ is positive, and therefore $\sigma(S) \subset
[A, B]$. In particular, $S$ is onto. Conclude that every $x \in X$ can be
represented as $x = \sum_n (x, S^{-1}a_n)a_n$ (convergent in $X$).


Tensor products of vector spaces 


Here we use the notation from Section 8.8.1. In particular, the symbol $\otimes$
denotes the tensor product of vector spaces. Also, the letters $U, V, W, Z$
denote vector spaces over the same field $F$. In addition, we write $L(U,V)$
for the vector space of all linear maps from $U$ to $V$.


22. A bilinear map from $U \times V$ to $W$ is a function $B : U \times V \to W$ satisfying

\[ B\Big( \sum_{i=1}^{n} \alpha_i u_i,\ \sum_{j=1}^{m} \beta_j v_j \Big) = \sum_{i=1}^{n} \sum_{j=1}^{m} \alpha_i \beta_j B(u_i, v_j) \]

for all $n, m \in \mathbb{N}$, $u_1,\dots,u_n \in U$, $v_1,\dots,v_m \in V$, and
$\alpha_1,\dots,\alpha_n,\beta_1,\dots,\beta_m \in F$. The set of all such bilinear maps, denoted by
$L(U \times V, W)$, becomes a vector space when endowed with the pointwise
operations.

Prove that there is a linear isomorphism from the space $L(U \times V, W)$
onto the space $L(U \otimes V, W)$ mapping $B$ in the former to $T$ in the latter
given by $T(u \otimes v) = B(u,v)$ for all $u \in U$ and $v \in V$ ($T$ is called the
linearization of $B$).


23. Prove that if $T \in L(U,W)$ and $S \in L(V,Z)$ are injective, then the map
$T \otimes S \in L(U \otimes V, W \otimes Z)$ of Part (iv) of Proposition 8.16 is also injective.


24. (a) Prove that there exists a unique linear map from $L(U,W) \otimes L(V,Z)$
into $L(U \otimes V, W \otimes Z)$ sending $T \otimes S \in L(U,W) \otimes L(V,Z)$ to the element
of $L(U \otimes V, W \otimes Z)$ with the same notation defined by Part (iv) of
Proposition 8.16.

(b) Prove that this map is injective.


Tensor products of Hilbert spaces 


Here we use the notation from Section 8.8.2. In particular, the symbols
$\otimes_{\mathrm{alg}}$, $\otimes$ denote the tensor product of vector spaces and the tensor product
of Hilbert spaces, respectively. Unless otherwise specified, $X, Y, Z, W$ are
Hilbert spaces.


25. (a) Let $T \in B(X, Y)$. Prove that the value in $[0, \infty]$ of the positive sum
$\sum_{\alpha \in A} \|Te_\alpha\|^2$, with $\{e_\alpha\}_{\alpha \in A}$ being an orthonormal base of $X$, is
independent of the choice of $\{e_\alpha\}_{\alpha \in A}$. If this sum is finite we say that
$T$ is a Hilbert-Schmidt operator from $X$ to $Y$. Denote the set of all
these operators by $HS(X, Y)$.

(b) Prove that $HS(X, Y)$ is a linear subspace of $B(X, Y)$.

(c) For $T, S \in HS(X, Y)$, prove that the sum $\sum_{\alpha \in A} (Te_\alpha, Se_\alpha)$, with
$\{e_\alpha\}_{\alpha \in A}$ being an orthonormal base of $X$, converges, and its sum
is independent of the choice of $\{e_\alpha\}_{\alpha \in A}$. Denote this number by
$(T, S)_{HS}$.

(d) Prove that $(HS(X, Y), (\cdot,\cdot)_{HS})$ is a Hilbert space. Explain why the
norm $\|\cdot\|_{HS}$ induced by $(\cdot,\cdot)_{HS}$ dominates the operator norm:
$\|T\|_{B(X,Y)} \le \|T\|_{HS}$ for all $T \in HS(X, Y)$.

(e) For $x \in X$ and $y \in Y$, let $\theta_{y,x} : X \to Y$ be the linear operator $(\cdot, x)y$,
that is, $X \ni w \mapsto (w, x)y$. Prove that $\mathrm{span}\{\theta_{y,x};\ x \in X, y \in Y\}$,
which equals the set of finite rank operators in $B(X, Y)$, is dense in the
Hilbert space $HS(X, Y)$. Compare Part (d) of Exercise 8 in Chapter 6.

(f) The conjugate Hilbert space $\bar{X}$ of $X$ is defined as follows. Its
underlying set is isomorphic to that of $X$ and its elements are
denoted by $\bar{x}$, $x \in X$. It becomes a complex vector space by letting
$\bar{x} + \bar{y} := \overline{x + y}$ and $\alpha \cdot \bar{x} := \overline{\bar{\alpha} x}$ for $x, y \in X$ and $\alpha \in \mathbb{C}$. Endowed with
the inner product given by $(\bar{x}, \bar{y}) := (y, x)$ for $x, y \in X$ it becomes a
Hilbert space. Note that the map $\bar{X} \to X^*$ given by $\bar{x} \mapsto (\cdot, x)$ is an
isometric isomorphism of Banach spaces.

Prove that the map that for $x \in X$ and $y \in Y$ takes $\bar{x} \otimes y$ to $\theta_{y,x}$
extends (uniquely) to an isomorphism of the Hilbert spaces $\bar{X} \otimes Y$
and $HS(X, Y)$.

(g) Prove that every Hilbert-Schmidt operator from $X$ to $Y$ is compact.
(This part is not related to tensor products.)

26. Prove that if $T \in B(X, W)$ and $S \in B(Y, Z)$ are injective, then the map
$T \otimes S \in B(X \otimes Y, W \otimes Z)$ of Part (vi) of Proposition 8.18 is also injective.

27. (a) Prove that there is a unique linear map from $B(X, W) \otimes_{\mathrm{alg}} B(Y, Z)$
into $B(X \otimes Y, W \otimes Z)$ sending $T \otimes S \in B(X, W) \otimes_{\mathrm{alg}} B(Y, Z)$ to
the element of $B(X \otimes Y, W \otimes Z)$ with the same notation defined by
Part (vi) of Proposition 8.18.

(b) Prove that this map is injective.

28. Let $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ be complete $\sigma$-finite positive measure spaces.
In this exercise we prove that $L^2(\mu) \otimes L^2(\nu)$ is “equal” to $L^2(\mu \times \nu)$.

For functions $f : X \to \mathbb{C}$ and $g : Y \to \mathbb{C}$, define $f \times g : X \times Y \to \mathbb{C}$
by $(f \times g)(x, y) := f(x)g(y)$ ($x \in X$, $y \in Y$). Recall from Exercise 4
of Chapter 2 that if $f \in L^2(X, \mathcal{A}, \mu)$ and $g \in L^2(Y, \mathcal{B}, \nu)$, then
$f \times g \in L^2(X \times Y, \mathcal{A} \times \mathcal{B}, \mu \times \nu)$.

Consider the Hilbert spaces $L^2(\mu) := L^2(X, \mathcal{A}, \mu)$, $L^2(\nu) := L^2(Y, \mathcal{B}, \nu)$
and $L^2(\mu \times \nu) := L^2(X \times Y, \mathcal{A} \times \mathcal{B}, \mu \times \nu)$. Prove that there
exists a unique map in $B(L^2(\mu) \otimes L^2(\nu), L^2(\mu \times \nu))$ that sends $f \otimes g$ to
$f \times g$ for all $f \in L^2(\mu)$ and $g \in L^2(\nu)$, and that this map is an isomorphism
of Hilbert spaces.

(Hint: first prove that the map exists and preserves the inner product,
and then use the averages lemma to prove that it is surjective.)

29. Let $(\Omega, \mathcal{A}, \mu)$ be a positive measure space and $X$ be a Hilbert space. In
this exercise we introduce the Hilbert space $L^2((\Omega, \mathcal{A}, \mu), X)$ and prove
that it is “equal” to $L^2(\Omega, \mathcal{A}, \mu) \otimes X$.


A function $f : \Omega \to X$ is called:

• weakly measurable if for every $x^* \in X^*$, the function $x^* \circ f : \Omega \to \mathbb{C}$ is
measurable;

• almost separably-valued if there exists a null set $E \in \mathcal{A}$ such that
$f(\Omega \setminus E)$ is separable in $X$;

• strongly measurable if it is weakly measurable and almost separably-valued.


(a) Prove that if $f : \Omega \to X$ is strongly measurable, then the scalar-valued
function $\|f(\cdot)\| : \Omega \to [0, \infty)$ (given by $\omega \mapsto \|f(\omega)\|$) is measurable.

(b) Let $L^2((\Omega, \mathcal{A}, \mu), X)$ be the set of all equivalence classes (under
equality a.e.) of strongly measurable functions $f : \Omega \to X$ satisfying
$\int_\Omega \|f(\omega)\|^2 \, d\mu(\omega) < \infty$. Prove that $L^2((\Omega, \mathcal{A}, \mu), X)$ is a Hilbert space
with respect to the pointwise operations and the inner product

\[
(f, g) := \int_\Omega (f(\omega), g(\omega))_X \, d\mu(\omega) \qquad (f, g \in L^2((\Omega, \mathcal{A}, \mu), X))
\]

(the fact that this integral exists also requires proof!).
(Hint: mimic the proof of Theorem 1.29.)

(c) For $f : \Omega \to \mathbb{C}$ and $x \in X$, let $fx : \Omega \to X$ be given by $\omega \mapsto f(\omega)x$.
Prove that if $f \in L^2(\Omega, \mathcal{A}, \mu)$ then $fx \in L^2((\Omega, \mathcal{A}, \mu), X)$.

(d) Prove that there exists a unique map in $B(L^2(\Omega, \mathcal{A}, \mu) \otimes X, L^2((\Omega, \mathcal{A}, \mu), X))$
that sends $f \otimes x \in L^2(\Omega, \mathcal{A}, \mu) \otimes X$ to $fx \in L^2((\Omega, \mathcal{A}, \mu), X)$
for every $f \in L^2(\Omega, \mathcal{A}, \mu)$ and $x \in X$, and that this
map is an isomorphism of Hilbert spaces.


Remark that for two positive measure spaces as in Exercise 28, 
Exercises 28 and 29 provide two different “realizations” of the tensor 
product of their L?-spaces. 


9 


Integral representation 


A well-known elementary fact about any selfadjoint matrix T with complex 
entries is that it is unitarily equivalent to a diagonal matrix, with the eigenvalues 
of T as its diagonal entries. Such a diagonal matrix is clearly the linear 
combination $\sum_j \lambda_j E_j$, where $\lambda_j$ are the eigenvalues and $E_j$ are diagonal matrices
with 1 at one spot of the diagonal and 0 elsewhere. If $Q$ is the unitary matrix
of the equivalence, then the given selfadjoint matrix $T$ is the linear combination
$\sum_j \lambda_j F_j$, where $F_j = QE_jQ^*$ are selfadjoint projections, just like $E_j$. Shifting to
the corresponding operators, we get a version that suggests a straightforward 
generalization to infinite-dimensional complex Hilbert spaces, with the sum 
“representation” replaced by an integral representation over the spectrum of 
T. This chapter is mostly concerned with such representations from a rather 
general point of view. After a short discussion of spectral measures on a “Banach 
subspace” Z of a Banach space X and the corresponding spectral integral, we 
prove the classical spectral theorem for a normal operator T on a Hilbert space 
X, which establishes that every continuous function f(T) of T is the “spectral” 
integral of f with respect to a unique spectral measure on X supported by the 
spectrum of T. The spectral measure is then shown to allow a “representation” 
of f(T) as a “multiplication by f” operator on a suitable direct sum of L? 
spaces. A renorming method is used in the general Banach space setting to 
construct a maximal Banach subspace Z, called the semi-simplicity space of the 
given operator $T$, on which $T$ admits an operational calculus $f \mapsto f(T)$ with
continuous functions $f$, thus generating an integral representation of $f(T)$ with
respect to a spectral measure on Z when X is reflexive. 

The more restrictive analytic operational calculus for elements of an arbitrary 
Banach algebra is then defined and studied, and is used to develop the Riesz- 
Schauder theory of compact operators in arbitrary Banach spaces. 
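
To make the finite-dimensional picture described at the start of this introduction concrete, the following worked example decomposes one particular real symmetric $2 \times 2$ matrix. The matrix and the names $F_1, F_3$ are assumptions chosen here for illustration; they do not come from the text.

```latex
% Illustrative example: T has eigenvalues 1 and 3, with eigenvectors (1,-1) and (1,1).
\[
T = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
F_1 = \tfrac12 \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
F_3 = \tfrac12 \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.
\]
% F_1 and F_3 are selfadjoint projections with orthogonal ranges that sum to I:
\[
F_1^2 = F_1 = F_1^*, \qquad F_3^2 = F_3 = F_3^*, \qquad F_1F_3 = 0, \qquad F_1 + F_3 = I,
\]
% and T is the corresponding linear combination:
\[
1 \cdot F_1 + 3 \cdot F_3
= \tfrac12 \begin{pmatrix} 1+3 & -1+3 \\ -1+3 & 1+3 \end{pmatrix}
= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} = T .
\]
```

The integral representations studied in this chapter replace the finite sum $\sum_j \lambda_j F_j$ by an integral of $\lambda$ against a projection-valued measure.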


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press. 
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0009




9.1 Spectral measure on a Banach subspace 


Let X be a Banach space. A Banach subspace Z of X is a subspace of X in the 
algebraic sense, which is a Banach space for a norm || - ||z larger than or equal 
to the given norm || - || of X. Clearly, if Z and X are Banach subspaces of each 
other, then they coincide as Banach spaces (with equality of norms). 
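
A standard instance of this definition, added here as an illustration (with $\ell^1$ and $\ell^2$ the usual sequence spaces, which are an assumption of the example), is $Z = \ell^1$ inside $X = \ell^2$:

```latex
% Every summable sequence is square-summable, with a norm inequality in the required direction.
\[
\|x\|_2^2 = \sum_n |x_n|^2 \le \Bigl(\sum_n |x_n|\Bigr)^2 = \|x\|_1^2
\qquad (x \in \ell^1).
\]
% Hence \ell^1 \subset \ell^2 algebraically, \ell^1 is complete for \|\cdot\|_1,
% and \|\cdot\|_1 \ge \|\cdot\|_2 on \ell^1.
```

In this example $\ell^1$ is not closed in $\ell^2$, so a Banach subspace in the present sense need not be a closed subspace of $X$.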

Let $K$ be a compact subset of the complex plane $\mathbb{C}$, and let $C(K)$ be the
Banach algebra of all complex continuous functions on $K$ with the supremum
norm $\|f\| := \sup_K |f|$.

Let T € B(X), and suppose Z is a T-invariant Banach subspace of X, such 
that T|z, the restriction of T to Z, belongs to B(Z). 

A (contractive) $C(K)$-operational calculus for $T$ on $Z$ is a (contractive) $C(K)$-operational
calculus for $T|_Z$ in $B(Z)$, that is, a (norm-decreasing) continuous
algebra-homomorphism $\tau : C(K) \to B(Z)$ such that $\tau(f_0) = I|_Z$ and $\tau(f_1) = T|_Z$,
where $f_k(\lambda) = \lambda^k$, $k = 0, 1$. When such $\tau$ exists, we say that $T$ is of
(contractive) class $C(K)$ on $Z$ (or that $T|_Z$ is of (contractive) class $C(K)$).

If the complex number $\beta$ is not in $K$, the function $g_\beta(\lambda) = (\beta - \lambda)^{-1}$ belongs
to $C(K)$ and $(\beta - \lambda)g_\beta(\lambda) = 1$ on $K$. Since $\tau$ is an algebra homomorphism of
$C(K)$ into $B(Z)$, it follows that $(\beta I - T)|_Z$ has the inverse $\tau(g_\beta)$ in $B(Z)$. In
particular, $\sigma_{B(Z)}(T|_Z) \subset K$.

Let $f$ be any rational function with poles off $K$. Write

\[
f(\lambda) = \alpha \prod_{k,j} (\alpha_k - \lambda)(\beta_j - \lambda)^{-1},
\]

where $\alpha \in \mathbb{C}$, $\alpha_k$ are the zeroes of $f$ and $\beta_j$ are its poles (reduced decomposition).
Since $\tau$ is an algebra homomorphism, $\tau(f)$ is uniquely determined as

\[
\tau(f) = \alpha \prod_{k,j} (\alpha_k I - T)|_Z \, (\beta_j I - T)|_Z^{-1}. \tag{0}
\]

If $K$ has planar Lebesgue measure zero, the rational functions with poles off
$K$ are dense in $C(K)$, by the Hartogs-Rosenthal theorem (cf. Exercise 2). The
continuity of $\tau$ implies then that the $C(K)$-operational calculus for $T$ on $Z$ is
unique (when it exists). The operator $\tau(f) \in B(Z)$ is usually denoted by $f(T|_Z)$,
for $f \in C(K)$.

A spectral measure on $Z$ is a map $E$ of the Borel algebra $\mathcal{B}(\mathbb{C})$ of $\mathbb{C}$ into
$B(Z)$, such that:

(1) For each $x \in Z$ and $x^* \in X^*$, $x^*E(\cdot)x$ is a regular complex Borel measure;
and

(2) $E(\mathbb{C}) = I|_Z$, and $E(\delta \cap \epsilon) = E(\delta)E(\epsilon)$ for all $\delta, \epsilon \in \mathcal{B}(\mathbb{C})$.

The spectral measure $E$ is contractive if $\|E(\delta)\|_{B(Z)} \le 1$ for all $\delta \in \mathcal{B}(\mathbb{C})$.

By Property (2), $E(\delta)$ is a projection in $Z$; therefore a contractive spectral
measure satisfies $\|E(\delta)\|_{B(Z)} = 1$ whenever $E(\delta) \ne 0$. The closed subspaces
$E(\delta)Z$ of $Z$ satisfy the relation

\[
E(\delta)Z \cap E(\epsilon)Z = E(\delta \cap \epsilon)Z
\]

for all $\delta, \epsilon \in \mathcal{B}(\mathbb{C})$. In particular, $E(\delta)Z \cap E(\epsilon)Z = \{0\}$ if $\delta \cap \epsilon = \emptyset$. Therefore,
for any partition $\{\delta_k;\ k \in \mathbb{N}\}$ of $\mathbb{C}$, $Z$ has the direct sum decomposition
$Z = \sum_k \oplus E(\delta_k)Z$ (cf. Property (2) of $E$). This is equivalent to the decomposition
$I|_Z = \sum_k E(\delta_k)$, with projections $E(\delta_k) \in B(Z)$ such that $E(\delta_k)E(\delta_j) = 0$ for
$k \ne j$. For that reason, $E$ is also called a resolution of the identity on $Z$.

Property (1) of $E$ together with Theorem 1.43 and Corollary 6.7 imply that
$E(\cdot)x$ is a $Z$-valued additive set function and

\[
M_x := \sup_{\delta \in \mathcal{B}(\mathbb{C})} \|E(\delta)x\| < \infty \tag{1}
\]

for each $x \in Z$.
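
As a sanity check on the definition of a spectral measure, here is a minimal finite-dimensional sketch. It assumes, purely for illustration, $Z = X = \mathbb{C}^2$ with the Euclidean norm, two distinct points $\lambda_1, \lambda_2 \in \mathbb{C}$ chosen arbitrarily, and the notation $m_\lambda$ for the unit point mass at $\lambda$; none of these choices come from the text.

```latex
% A spectral measure concentrated on two points.
\[
F_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
F_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
E(\delta) := \sum_{j:\ \lambda_j \in \delta} F_j \quad (\delta \in \mathcal{B}(\mathbb{C})).
\]
% Property (2): normalization and multiplicativity follow from F_j F_k = 0 for j \ne k and F_j^2 = F_j.
\[
E(\mathbb{C}) = F_1 + F_2 = I, \qquad
E(\delta)E(\epsilon) = \sum_{\lambda_j \in \delta}\ \sum_{\lambda_k \in \epsilon} F_jF_k
= \sum_{\lambda_j \in \delta \cap \epsilon} F_j = E(\delta \cap \epsilon).
\]
% Property (1): each scalar measure is a finite combination of point masses.
\[
x^*E(\cdot)x = \sum_{j=1}^{2} x^*(F_jx)\, m_{\lambda_j}, \qquad
M_x = \sup_{\delta} \|E(\delta)x\| = \|x\| .
\]
```

Here $E(\delta)$ only takes the values $0$, $F_1$, $F_2$, $I$, all of norm at most 1, so this spectral measure is contractive.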


9.2 Integration 


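If $\mu$ is a real Borel measure on $\mathbb{C}$ (to fix the ideas), and $\{\delta_k\}$ is a partition of $\mathbb{C}$,
let $J = \{k;\ \mu(\delta_k) > 0\}$. Then

\[
\sum_k |\mu(\delta_k)| = \sum_{k \in J} \mu(\delta_k) - \sum_{k \in \mathbb{N} - J} \mu(\delta_k)
= \mu\Bigl(\bigcup_{k \in J} \delta_k\Bigr) - \mu\Bigl(\bigcup_{k \in \mathbb{N} - J} \delta_k\Bigr) \le 2M,
\]

where $M := \sup_{\delta \in \mathcal{B}(\mathbb{C})} |\mu(\delta)|$ ($\le \|\mu\|$, the total variation norm of $\mu$).

If $\mu$ is a complex Borel measure, apply the shown inequality to the real Borel
measures $\Re\mu$ and $\Im\mu$ to conclude that $\sum_k |\mu(\delta_k)| \le 4M$ for all partitions, hence
$\|\mu\| \le 4M$.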
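
The estimate $\sum_k |\mu(\delta_k)| \le 2M$ for real measures cannot be improved in general. A small illustration, added here under the assumption that $a \ne b$ are arbitrary points of $\mathbb{C}$ and that $m_a, m_b$ denote the unit point masses at them:

```latex
% mu takes only the values 0, 1, -1 on Borel sets, yet its total variation is 2.
\[
\mu := m_a - m_b, \qquad
M = \sup_{\delta \in \mathcal{B}(\mathbb{C})} |\mu(\delta)| = 1, \qquad
\|\mu\| = |\mu(\{a\})| + |\mu(\{b\})| = 2 = 2M .
\]
```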

By (1) of Section 9.1, for each $x^* \in X^*$ and $x \in Z$,

\[
\sup_{\delta \in \mathcal{B}(\mathbb{C})} |x^*E(\delta)x| \le M_x \|x^*\|.
\]

Therefore

\[
\|x^*E(\cdot)x\| \le 4M_x \|x^*\|. \tag{1}
\]


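The Banach algebra of all bounded complex Borel functions on $\mathbb{C}$ or $\mathbb{R}$ is
denoted by $\mathbb{B}(\mathbb{C})$ or $\mathbb{B}(\mathbb{R})$, respectively (briefly, $\mathbb{B}$); $\mathbb{B}_0(\mathbb{C})$ and $\mathbb{B}_0(\mathbb{R})$ (briefly $\mathbb{B}_0$)
are the respective dense subalgebras of simple Borel functions. The norm on $\mathbb{B}$
is the supremum norm (denoted $\|\cdot\|$).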

Integration with respect to the vector measure $E(\cdot)x$ is defined on simple
Borel functions as in Definition 1.12. It follows from (1) that for $f \in \mathbb{B}_0$, $x \in Z$,
and $x^* \in X^*$,

\[
\Bigl| x^* \int_{\mathbb{C}} f \, dE(\cdot)x \Bigr| = \Bigl| \int_{\mathbb{C}} f \, d\,x^*E(\cdot)x \Bigr|
\le \|f\| \, \|x^*E(\cdot)x\| \le 4M_x \|f\| \, \|x^*\|.
\]

Therefore

\[
\Bigl\| \int_{\mathbb{C}} f \, dE(\cdot)x \Bigr\| \le 4M_x \|f\|,
\]

and we conclude that the map $f \mapsto \int_{\mathbb{C}} f \, dE(\cdot)x$ is a continuous linear map
from $\mathbb{B}_0$ to $X$ with norm $\le 4M_x$. It extends uniquely (by density of $\mathbb{B}_0$ in $\mathbb{B}$)
to a continuous linear map (same notation!) of $\mathbb{B}$ into $X$, with the same norm
($\le 4M_x$).

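It follows clearly from the definition that the vector $\int_{\mathbb{C}} f \, dE(\cdot)x$ (belonging
to $X$, and not necessarily in $Z$!) satisfies the relation

\[
x^* \int_{\mathbb{C}} f \, dE(\cdot)x = \int_{\mathbb{C}} f \, d\,x^*E(\cdot)x \tag{2}
\]

for all $f \in \mathbb{B}(\mathbb{C})$, $x \in Z$, and $x^* \in X^*$.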

As for scalar measures, the support of $E$, $\mathrm{supp}\,E$, is the complement in $\mathbb{C}$ of
the union of all open sets $\delta$ such that $E(\delta) = 0$. The support of each complex
measure $x^*E(\cdot)x$ is then contained in the support of $E$. One has

\[
\int_{\mathbb{C}} f \, dE(\cdot)x = \int_{\mathrm{supp}\,E} f \, dE(\cdot)x
\]

for all $f \in \mathbb{B}$ and $x \in Z$ (where as usual the right-hand side is defined as the
integral over $\mathbb{C}$ of $f\chi_{\mathrm{supp}\,E}$, and $\chi_V$ denotes in this chapter the indicator of
$V \subset \mathbb{C}$) (cf. (2) of Section 3.5). The right-hand side of the last equation can be
used to extend the definition of the integral to complex Borel functions that are
only bounded on $\mathrm{supp}\,E$.

In case $Z = X$, it follows from (1) of Section 9.1 and the uniform boundedness
theorem that $M := \sup_{\delta \in \mathcal{B}(\mathbb{C})} \|E(\delta)\| < \infty$, $M_x \le M\|x\|$, and (1) takes the form
$\|x^*E(\cdot)x\| \le 4M\|x\|\,\|x^*\|$ for all $x \in X$ and $x^* \in X^*$. It then follows that the
spectral integral $\int_{\mathbb{C}} f \, dE$, defined by

\[
\Bigl( \int_{\mathbb{C}} f \, dE \Bigr) x := \int_{\mathbb{C}} f \, dE(\cdot)x \qquad (f \in \mathbb{B}), \tag{3}
\]

belongs to $B(X)$ and has operator norm $\le 4M\|f\|$.

If $X$ is a Hilbert space and $Z = X$, one may use the Riesz representation for
$X^*$ to express Property (1) of $E$ and Relation (2) in the following form:

For each $x, y \in X$, $(E(\cdot)x, y)$ is a regular complex measure.

For each $x, y \in X$ and $f \in \mathbb{B}(\mathbb{C})$,

\[
\Bigl( \Bigl( \int_{\mathbb{C}} f \, dE \Bigr) x, y \Bigr) = \int_{\mathbb{C}} f \, d(E(\cdot)x, y). \tag{4}
\]

In the Hilbert space context, it is particularly significant to consider selfadjoint
spectral measures on $X$, $E(\cdot)$, that is, $E(\delta)^* = E(\delta)$ for all $\delta \in \mathcal{B}(\mathbb{C})$. The
operators $E(\delta)$ are then orthogonal projections, and any partition $\{\delta_k;\ k \in \mathbb{N}\}$
gives an orthogonal decomposition $X = \sum_k \oplus E(\delta_k)X$ into mutually orthogonal
closed subspaces (by Property (2) of $E(\cdot)$). Equivalently, the identity operator is
the sum of the mutually orthogonal projections $E(\delta_k)$ (the adjective “orthogonal”
is transferred from the subspace to the corresponding projection). For this reason,
$E$ is also called a (selfadjoint) resolution of the identity.

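
When a spectral measure is concentrated on finitely many points, the spectral integral reduces to a finite sum, because $f\chi_{\mathrm{supp}\,E}$ is then a simple function. The sketch below assumes, for illustration only, $X = \mathbb{C}^2$ and a spectral measure with $E(\{1\}) = F_1$, $E(\{3\}) = F_3$ and $E(\delta) = 0$ whenever $\delta$ contains neither point, where $F_1, F_3$ are the specific projections written out in the block.

```latex
% Assumed data: two complementary projections attached to the points 1 and 3.
\[
F_1 = \tfrac12 \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \qquad
F_3 = \tfrac12 \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
\int_{\mathbb{C}} f \, dE = f(1)F_1 + f(3)F_3 \quad (f \in \mathbb{B}).
\]
% f(\lambda) = \lambda and f(\lambda) = \lambda^2 give an operator and its square:
\[
\int_{\mathbb{C}} \lambda \, dE(\lambda) = F_1 + 3F_3 = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
\int_{\mathbb{C}} \lambda^2 \, dE(\lambda) = F_1 + 9F_3 = \begin{pmatrix} 5 & 4 \\ 4 & 5 \end{pmatrix}
= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}^{2}.
\]
```

The last equality is an instance of the multiplicativity of $f \mapsto \int f \, dE$, which rests on $F_1F_3 = 0$ and $F_j^2 = F_j$.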


9.3 Case Z = X


The relationship between C(K)-operational calculi and spectral measures is 
especially simple in case Z = X. 


Theorem 9.1. 


(1) Let $E$ be a spectral measure on the Banach space $X$, supported by the
compact subset $K$ of $\mathbb{C}$, and let $\tau(f)$ be the associated spectral integral, for
$f \in \mathbb{B}(\mathbb{C})$. Then $\tau : \mathbb{B} \to B(X)$ is a continuous representation of $\mathbb{B}$ on
$X$ with norm $\le 4M$. The restriction of $\tau$ to (Borel) functions continuous
on $K$ defines a $C(K)$-operational calculus for $T := \tau(f_1) = \int_K \lambda \, dE(\lambda)$.

If $X$ is a Hilbert space and the spectral measure $E$ is selfadjoint, then
$\tau$ is a norm-decreasing *-representation of $\mathbb{B}$, and $\tau|_{C(K)}$ is a contractive
$C(K)$-operational calculus on $X$ for $T$ (sending adjoints to adjoints).

(2) Conversely, let $\tau$ be a $C(K)$-operational calculus for a given operator $T$ on
the reflexive Banach space $X$. Then there exists a unique spectral measure
$E$ commuting with $T$ with support in $K$, such that $\tau(f) = \int_K f \, dE$ for all
$f \in C(K)$ (and the spectral integral on the right-hand side extends $\tau$ to a
continuous representation of $\mathbb{B}$ on $X$).
If $X$ is a Hilbert space and $\tau$ sends adjoints to adjoints, then $E$ is a
contractive selfadjoint spectral measure on $X$.


Proof.
(1) A calculation shows that Property (2) of $E$ implies the multiplicativity
of $\tau$ on $\mathbb{B}_0$. Since $\tau : \mathbb{B} \to B(X)$ was shown to be linear and continuous (with
norm at most $4M$), it follows from the density of $\mathbb{B}_0$ in $\mathbb{B}$ that $\tau$ is a continuous
algebra homomorphism of $\mathbb{B}$ into $B(X)$ and $\tau(1) = E(\mathbb{C}) = I$.

If $X$ is a Hilbert space and $E$ is selfadjoint, then $\tau$ restricted to $\mathbb{B}_0$ sends
adjoints to adjoints, because if $f = \sum_k \alpha_k \chi_{\delta_k}$, then
$\tau(\bar{f}) = \sum_k \bar{\alpha}_k E(\delta_k) = [\sum_k \alpha_k E(\delta_k)]^* = \tau(f)^*$.
By continuity of $\tau$ and of the involutions, and by density
of $\mathbb{B}_0$ in $\mathbb{B}$, $\tau(\bar{f}) = \tau(f)^*$ for all $f \in \mathbb{B}$.

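Let $\{\delta_k\}$ be a partition of $\mathbb{C}$. For any $x \in X$, the sequence $\{E(\delta_k)x\}$ is
orthogonal with sum equal to $x$. Therefore

\[
\sum_k \|E(\delta_k)x\|^2 = \|x\|^2. \tag{1}
\]

By Schwarz's inequality for $X$ and for $l^2$, we have for all $x, y \in X$ (since $E(\delta_k)$
are selfadjoint projections):

\[
\sum_k |(E(\delta_k)x, y)| = \sum_k |(E(\delta_k)x, E(\delta_k)y)| \le \sum_k \|E(\delta_k)x\| \, \|E(\delta_k)y\|
\le \Bigl( \sum_k \|E(\delta_k)x\|^2 \Bigr)^{1/2} \Bigl( \sum_k \|E(\delta_k)y\|^2 \Bigr)^{1/2} = \|x\| \, \|y\|.
\]

Hence

\[
\|(E(\cdot)x, y)\| \le \|x\| \, \|y\| \qquad (x, y \in X). \tag{2}
\]

Therefore,

\[
|(\tau(f)x, y)| = \Bigl| \int_{\mathbb{C}} f \, d(E(\cdot)x, y) \Bigr| \le \|f\| \, \|x\| \, \|y\|,
\]

that is, $\|\tau(f)\| \le \|f\|$ for all $f \in \mathbb{B}$.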

(2) Let $\tau$ be a $C(K)$-operational calculus on $X$ for $T \in B(X)$. For each
$x \in X$ and $x^* \in X^*$, $x^*\tau(\cdot)x$ is a continuous linear functional on $C(K)$ with
norm $\le \|\tau\| \, \|x\| \, \|x^*\|$ (where $\|\tau\|$ denotes the norm of the bounded linear map
$\tau : C(K) \to B(X)$). By the Riesz representation theorem, there exists a unique
regular complex Borel measure $\mu = \mu(\cdot; x, x^*)$ on $\mathcal{B}(\mathbb{C})$, with support in $K$, such
that

\[
x^*\tau(f)x = \int_K f \, d\mu(\cdot; x, x^*) \tag{3}
\]

and

\[
\|\mu(\cdot; x, x^*)\| \le \|\tau\| \, \|x\| \, \|x^*\| \tag{4}
\]

for all $f \in C(K)$, $x \in X$, and $x^* \in X^*$.

For each fixed $\delta \in \mathcal{B}(\mathbb{C})$ and $x \in X$, it follows from the uniqueness of the
Riesz representation, the linearity of the left-hand side of (3) with respect to $x^*$,
and (4), that the map $\mu(\delta; x, \cdot)$ is a continuous linear functional on $X^*$, with
norm $\le \|\tau\| \, \|x\|$. Since $X$ is assumed reflexive, there exists a unique vector in $X$,
which we denote $E(\delta)x$, such that

\[
\mu(\delta; x, x^*) = x^*E(\delta)x.
\]

A routine calculation using the uniqueness properties mentioned and the linearity
of the left-hand side of (3) with respect to $x$, shows that $E(\delta) : X \to X$ is linear.
By (4), $E(\delta)$ is bounded with operator norm $\le \|\tau\|$. By definition, $E$ satisfies
Property (1) of spectral measures on $X$, and has support in $K$. Therefore, the
integral $\int_{\mathbb{C}} f \, dE$ makes sense (see mentioned construction) for any Borel function
$f : \mathbb{C} \to \mathbb{C}$ bounded on $K$, and (by (2) of Section 9.2 and (3))

\[
\tau(f) = \int_{\mathbb{C}} f \, dE \qquad (f \in C(K)). \tag{5}
\]
Cc 
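For all $f, g \in C(K)$ and $x \in X$,

\[
\int_K f \, dE(\cdot)\tau(g)x = \tau(f)\tau(g)x = \tau(g)\tau(f)x = \int_K f \, d\,\tau(g)E(\cdot)x
= \tau(fg)x = \int_K fg \, dE(\cdot)x.
\]

The uniqueness of the Riesz representation implies that $E(\cdot)\tau(g) = \tau(g)E(\cdot)$ (in
particular, $E$ commutes with $\tau(f_1) = T$) and $dE(\cdot)\tau(g)x = g \, dE(\cdot)x$. Therefore,
for all $\delta \in \mathcal{B}(\mathbb{C})$ and $g \in C(K)$,

\[
\int_K g \, dE(\cdot)E(\delta)x = \tau(g)E(\delta)x = E(\delta)\tau(g)x
= \int_K \chi_\delta \, dE(\cdot)\tau(g)x = \int_K \chi_\delta g \, dE(\cdot)x. \tag{6}
\]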


By uniqueness of the Riesz representation, we get $dE(\cdot)E(\delta)x = \chi_\delta \, dE(\cdot)x$. Thus,
for all $\epsilon, \delta \in \mathcal{B}(\mathbb{C})$ and $x \in X$,

\[
E(\epsilon)E(\delta)x = \int_K \chi_\epsilon \, dE(\cdot)E(\delta)x = \int_K \chi_\epsilon \chi_\delta \, dE(\cdot)x
= \int_K \chi_{\epsilon \cap \delta} \, dE(\cdot)x = E(\epsilon \cap \delta)x.
\]

Taking $f = f_0 (= 1)$ in (5), we get $E(\mathbb{C}) = \tau(f_0) = I$, so that $E$ satisfies
Property (2) of spectral measures. Relation (5) provides the wanted relation
between the operational calculus and the spectral measure $E$. The uniqueness of
$E$ follows from Property (1) of spectral measures and the uniqueness of the Riesz
representation. Finally, if $X$ is a Hilbert space and $\tau$ sends adjoints to adjoints,
then by (4) of Section 9.2

\[
\int_K f \, d(E(\cdot)x, y) = (\tau(f)x, y) = (x, \tau(f)^*y) = (x, \tau(\bar{f})y)
= \overline{(\tau(\bar{f})y, x)} = \overline{\int_K \bar{f} \, d(E(\cdot)y, x)}
= \int_K f \, d\,\overline{(E(\cdot)y, x)} = \int_K f \, d(x, E(\cdot)y)
\]

for all $f \in C(K)$ and $x, y \in X$. Therefore (by uniqueness of the Riesz
representation), $(E(\delta)x, y) = (x, E(\delta)y)$ for all $\delta$, that is, $E$ is a selfadjoint
spectral measure. By (2), it is necessarily contractive.


Terminology 9.2. Given a spectral measure $E$ on $X$ with compact support,
the bounded operator $T := \int_{\mathbb{C}} \lambda \, dE(\lambda)$ is the associated scalar operator. By
Theorem 9.1, $T$ is of class $C(K)$ on $X$ for any compact set $K$ containing the
support of $E$, and $\tau : f \mapsto \int_K f \, dE$ ($f \in C(K)$) is a $C(K)$-operational calculus
for $T$. Conversely, if $X$ is reflexive, then any operator of class $C(K)$, for a given
compact set $K$, is a scalar operator (the scalar operator associated with the
spectral measure $E$ in Theorem 9.1, Part 2).

If $E$ and $F$ are spectral measures on $X$ with support in the compact set
$K$, and their associated scalar operators coincide, it follows from Theorem 9.1
(Part 1) that their associated spectral integrals coincide for all rational functions
with poles off $K$. In case $K$ has planar Lebesgue measure zero, these rational
functions are dense in $C(K)$ (by the Hartogs-Rosenthal theorem), and the
continuity of the spectral integrals on $C(K)$ (cf. Theorem 9.1) implies that they
coincide on $C(K)$. It then follows that $E = F$, by the uniqueness of the Riesz
representation. The uniqueness of the spectral measure associated with a scalar
operator can be proved without the “planar measure zero” condition on $K$, but
this will not be done here. The unique spectral measure with compact support
associated with the scalar operator $T$ is called the resolution of the identity for $T$.

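For each $\delta \in \mathcal{B}(\mathbb{C})$, the projection $E(\delta)$ commutes with $T$ (cf. Theorem 9.1,
Part 2), and therefore the closed subspace $E(\delta)X$ reduces $T$. If $\mu$ is a complex
number not in the closure $\bar{\delta}$ of $\delta$, the function $h(\lambda) := \chi_\delta(\lambda)/(\mu - \lambda)$ belongs
to $\mathbb{B}$ ($\|h\| \le 1/\mathrm{dist}(\mu, \delta)$) and $(\mu - \lambda)h(\lambda) = \chi_\delta(\lambda)$. Applying the $\mathbb{B}$-operational
calculus, we get $(\mu I - T)\tau(h) = E(\delta)$. Restricting to $E(\delta)X$, this means that
$\mu \in \rho(T|_{E(\delta)X})$. Hence

\[
\sigma(T|_{E(\delta)X}) \subset \bar{\delta} \qquad (\delta \in \mathcal{B}(\mathbb{C})). \tag{7}
\]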


Remark 9.3. A bounded operator $T$ for which there exists a spectral measure
$E$ on $X$, commuting with $T$ and with support in $\sigma(T)$, such that (7) is satisfied,
is called a spectral operator. It turns out that $E$ is uniquely determined; it is
called the resolution of the identity for $T$, as before. If $S$ is the scalar operator
associated with $E$, it can be proved that $T$ has the (unique) Jordan decomposition
$T = S + N$ with $N$ quasi-nilpotent commuting with $S$. Conversely, any operator
with such a Jordan decomposition is spectral. The Jordan canonical form for
complex matrices establishes the fact that every linear operator on $\mathbb{C}^n$ is spectral
(for any finite $n$).
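
A minimal illustration of a spectral operator with a non-zero quasi-nilpotent part is the $2 \times 2$ Jordan block. This standard example is added here for orientation; $\lambda \in \mathbb{C}$ is arbitrary.

```latex
% Jordan decomposition of a 2x2 Jordan block: scalar part S and nilpotent part N.
\[
T = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix} = S + N, \qquad
S = \lambda I, \qquad
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
N^2 = 0, \qquad SN = NS .
\]
% The resolution of the identity is concentrated at the single point \lambda = \sigma(T):
\[
E(\delta) = \begin{cases} I, & \lambda \in \delta, \\ 0, & \lambda \notin \delta, \end{cases}
\qquad
S = \int_{\mathbb{C}} \mu \, dE(\mu) = \lambda I .
\]
```

Since $N \ne 0$, the operator $T$ differs from its scalar part $S$.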


Combining 7.19 and Theorem 9.1, we obtain the following. 


9.4 The spectral theorem for normal operators 


Theorem 9.4. Let $T$ be a normal operator on the Hilbert space $X$, and let
$\tau : f \mapsto f(T)$ be its $C(\sigma(T))$-operational calculus (cf. Terminology 7.19). Then there
exists a unique selfadjoint spectral measure $E$ on $X$, commuting with $T$ and with
support in $\sigma(T)$, such that

\[
f(T) = \int_{\sigma(T)} f \, dE
\]

for all $f \in C(\sigma(T))$. The spectral integral above extends the $C(\sigma(T))$-operational
calculus to a norm-decreasing *-representation $\tau : f \mapsto f(T)$ of $\mathbb{B}(\mathbb{C})$ on $X$.
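
For a normal operator that is not selfadjoint, the spectrum and the resolution of the identity are genuinely complex. The following worked instance in $\mathbb{C}^2$ is an illustration added here; the matrix $T$ is an assumption of the example.

```latex
% A normal (indeed unitary) matrix with non-real spectrum.
\[
T = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad T^*T = TT^* = I, \qquad
\sigma(T) = \{i, -i\},
\]
% Orthogonal projections onto the eigenvectors (1,-i) and (1,i):
\[
E(\{i\}) = \tfrac12 \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}, \qquad
E(\{-i\}) = \tfrac12 \begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix}, \qquad
f(T) = f(i)\,E(\{i\}) + f(-i)\,E(\{-i\}).
\]
% f(\lambda) = \lambda recovers T; f(\lambda) = \bar\lambda gives the adjoint:
\[
i\,E(\{i\}) - i\,E(\{-i\}) = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = T, \qquad
-i\,E(\{i\}) + i\,E(\{-i\}) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = T^* .
\]
```

The second identity is the *-property of the representation in this small case: the function $\bar{f}_1(\lambda) = \bar{\lambda}$ is sent to $T^*$.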


Remark 9.5. We can use Terminology 9.2 to restate Theorem 9.4 in the 
form: normal operators are scalar, and their resolutions of the identity are 
selfadjoint, with support in the spectrum. The converse is also true, since the
map $\tau$ associated with a selfadjoint spectral measure with compact support is
a *-representation. If we consider spectral measures that are not necessarily
selfadjoint, it can be shown that a bounded operator $T$ in Hilbert space is scalar if
and only if it is similar to a normal operator, that is, iff there exists a non-singular
$Q \in B(X)$ such that $QTQ^{-1}$ is normal.


Theorem 9.6. Let $T$ be a normal operator on the Hilbert space $X$, and let $E$
be its resolution of the identity. Then

(1) For each $x \in X$, the measure $(E(\cdot)x, x) = \|E(\cdot)x\|^2$ is a positive regular
Borel measure bounded by $\|x\|^2$, and

\[
\|f(T)x\|^2 = \int_{\sigma(T)} |f|^2 \, d(E(\cdot)x, x)
\]

for all Borel functions $f$ bounded on $\sigma(T)$.

(2) If $\{\delta_k\}$ is a sequence of mutually disjoint Borel subsets of $\mathbb{C}$ with union $\delta$,
then for all $x \in X$,

\[
E(\delta)x = \sum_k E(\delta_k)x
\]

where the series converges strongly in $X$ (this is “strong $\sigma$-additivity”
of $E$).

(3) If $\{f_n\}$ is a sequence of Borel functions, uniformly bounded on $\sigma(T)$, such
that $f_n \to f$ pointwise on $\sigma(T)$, then $f_n(T) \to f(T)$ in the s.o.t.

(4) $\mathrm{supp}\,E = \sigma(T)$.

Proof. (1) Since $E(\delta)$ is a self-adjoint projection for each $\delta \in \mathcal{B}(\mathbb{C})$, we have

\[
(E(\delta)x, x) = (E(\delta)^2x, x) = (E(\delta)x, E(\delta)x) = \|E(\delta)x\|^2 \le \|x\|^2,
\]

and the stated properties of the measure follow from Property (1) of spectral
measures in Hilbert space.
If $f$ is a Borel function bounded on $\sigma(T)$, we have for all $x \in X$

\[
\|f(T)x\|^2 = (f(T)x, f(T)x) = (f(T)^*f(T)x, x) = ((\bar{f}f)(T)x, x)
= \int_{\sigma(T)} |f|^2 \, d(E(\cdot)x, x).
\]


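(2) The sequence $\{E(\delta_k)x\}$ is orthogonal. Therefore, for each $n \in \mathbb{N}$,

\[
\sum_{k=1}^{n} \|E(\delta_k)x\|^2 = \Bigl\| \sum_{k=1}^{n} E(\delta_k)x \Bigr\|^2
= \Bigl\| E\Bigl( \bigcup_{k=1}^{n} \delta_k \Bigr) x \Bigr\|^2 \le \|x\|^2.
\]

This shows that the series $\sum_k \|E(\delta_k)x\|^2$ converges. Therefore

\[
\Bigl\| \sum_{k=m}^{n} E(\delta_k)x \Bigr\|^2 = \sum_{k=m}^{n} \|E(\delta_k)x\|^2 \to 0
\]

when $m, n \to \infty$. By completeness of $X$, it follows that $\sum_k E(\delta_k)x$ converges
strongly in $X$. Hence, for all $y \in X$, Property (1) of spectral measures gives

\[
\Bigl( \sum_k E(\delta_k)x, y \Bigr) = \sum_k (E(\delta_k)x, y) = (E(\delta)x, y),
\]

and Part (2) follows.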


(3) For all $x \in X$, we have by Part (1)

\[
\|f_n(T)x - f(T)x\|^2 = \|(f_n - f)(T)x\|^2 = \int_{\sigma(T)} |f_n - f|^2 \, d\|E(\cdot)x\|^2,
\]

and Part (3) then follows from Lebesgue's dominated convergence theorem for
the finite positive measure $\|E(\cdot)x\|^2$.

(4) We already know that $\mathrm{supp}\,E \subset \sigma(T)$. So we need to prove that
$(\mathrm{supp}\,E)^c \subset \rho(T)$. Let $\mu \in (\mathrm{supp}\,E)^c$. By definition of the support, there exists
an open neighborhood $\delta$ of $\mu$ such that $E(\delta) = 0$. Then $r := \mathrm{dist}(\mu, \delta^c) > 0$.
Let $g(\lambda) := \chi_{\delta^c}(\lambda)/(\mu - \lambda)$. Then $g \in \mathbb{B}$ (in fact, $\|g\| = 1/r < \infty$),
and since $(\mu - \lambda)g(\lambda) = \chi_{\delta^c}(\lambda)$, the operational calculus for $T$ implies that
$(\mu I - T)g(T) = g(T)(\mu I - T) = E(\mathbb{C}) - E(\delta) = I$. Hence $\mu \in \rho(T)$.


Note that the proof of Part (4) of Theorem 9.6 is valid for any scalar operator 
in Banach space. 


9.5 Parts of the spectrum 


Definition 9.7. Let $X$ be a Banach space, and $T \in B(X)$.

(1) The set of all $\lambda \in \mathbb{C}$ such that $\lambda I - T$ is not injective is called the point
spectrum of $T$, and is denoted by $\sigma_p(T)$; any $\lambda \in \sigma_p(T)$ is called an
eigenvalue of $T$, and the non-zero subspace $\ker(\lambda I - T)$ is the corresponding
eigenspace. The non-zero vectors $x$ in the eigenspace are the eigenvectors
of $T$ corresponding to the eigenvalue $\lambda$ (briefly, the $\lambda$-eigenvectors of $T$):
they are the non-trivial solutions of the equation $Tx = \lambda x$.

(2) The set of all $\lambda \in \mathbb{C}$ such that $\lambda I - T$ is injective, and $\lambda I - T$ has range
dense but not equal to $X$, is called the continuous spectrum of $T$, and is
denoted by $\sigma_c(T)$.

(3) The set of all $\lambda \in \mathbb{C}$ such that $\lambda I - T$ is injective and $\lambda I - T$ has range not
dense in $X$ is called the residual spectrum of $T$, and is denoted by $\sigma_r(T)$.



Clearly, the three sets defined here are mutually disjoint subsets of $\sigma(T)$.
If $\lambda$ is not in their union, then $\lambda I - T$ is bijective, hence invertible in $B(X)$
(cf. Corollary 6.11). This shows that

\[
\sigma(T) = \sigma_p(T) \cup \sigma_c(T) \cup \sigma_r(T).
\]

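
A classical illustration of an empty point spectrum together with a non-empty residual spectrum is the unilateral shift on $l^2$. It is recalled here as an example and is not taken from the text; the operator $S(x_0, x_1, \dots) := (0, x_0, x_1, \dots)$ and the vectors $y_\lambda$ are assumptions of the example.

```latex
% No eigenvalues: compare coordinates in Sx = \lambda x.
\[
Sx = \lambda x \ \Longrightarrow\ \lambda x_0 = 0,\ \ \lambda x_n = x_{n-1}\ (n \ge 1)
\ \Longrightarrow\ x = 0, \qquad \text{so } \sigma_p(S) = \emptyset .
\]
% For |\lambda| < 1 the range of \lambda I - S is orthogonal to a non-zero vector:
\[
y_\lambda := (1, \bar{\lambda}, \bar{\lambda}^2, \dots) \in l^2, \qquad
((\lambda I - S)x, y_\lambda) = \sum_{n \ge 0} \lambda^{n+1} x_n - \sum_{n \ge 1} \lambda^{n} x_{n-1} = 0
\quad (x \in l^2).
\]
```

Thus, for $|\lambda| < 1$, the operator $\lambda I - S$ is injective with range not dense in $l^2$, so the open unit disc is contained in $\sigma_r(S)$.
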
Theorem 9.8. Let $X$ be a Hilbert space and $T \in B(X)$. Then:

(1) If $\lambda \in \sigma_r(T)$, then $\bar{\lambda} \in \sigma_p(T^*)$.

(2) $\lambda \in \sigma_p(T) \cup \sigma_r(T)$ iff $\bar{\lambda} \in \sigma_p(T^*) \cup \sigma_r(T^*)$.

(3) If $T$ is normal, then $\lambda \in \sigma_p(T)$ iff $\bar{\lambda} \in \sigma_p(T^*)$, and in that case the
$\lambda$-eigenspace of $T$ coincides with the $\bar{\lambda}$-eigenspace of $T^*$.

(4) If $T$ is normal, then $\sigma_r(T) = \emptyset$.

(5) If $T$ is normal and $E$ is its resolution of the identity, then

\[
\sigma_p(T) = \{\mu \in \mathbb{C};\ E(\{\mu\}) \ne 0\}
\]

and the range of $E(\{\mu\})$ coincides with the $\mu$-eigenspace of $T$, for each
$\mu \in \sigma_p(T)$.

Proof. (1) Let $\lambda \in \sigma_r(T)$. Since the range of $\lambda I - T$ is not dense in $X$, the
orthogonal decomposition theorem (Theorem 1.36) implies the existence of $y \ne 0$
orthogonal to this range. Hence, for all $x \in X$

\[
(x, [\bar{\lambda} I - T^*]y) = ((\lambda I - T)x, y) = 0,
\]

and therefore $y \in \ker(\bar{\lambda} I - T^*)$ and $\bar{\lambda} \in \sigma_p(T^*)$.

(2) Let $\lambda \in \sigma_p(T)$ and let then $x$ be an eigenvector of $T$ corresponding to
$\lambda$. Then for all $y \in X$, $(x, (\bar{\lambda} I - T^*)y) = ((\lambda I - T)x, y) = 0$, which implies
that the range of $\bar{\lambda} I - T^*$ is not dense in $X$ (because $x \ne 0$ is orthogonal to
it). Therefore $\bar{\lambda}$ belongs to $\sigma_r(T^*)$ or to $\sigma_p(T^*)$ if $\bar{\lambda} I - T^*$ is injective or not,
respectively. Together with Part (1), this shows that if $\lambda \in \sigma_p(T) \cup \sigma_r(T)$, then
$\bar{\lambda} \in \sigma_p(T^*) \cup \sigma_r(T^*)$. Applying this to $\bar{\lambda}$ and $T^*$, we get the reverse implication
(because $T^{**} = T$).

(3) For any normal operator $S$, $\|S^*x\| = \|Sx\|$ (for all $x$) because

\[
\|S^*x\|^2 = (S^*x, S^*x) = (SS^*x, x) = (S^*Sx, x) = (Sx, Sx) = \|Sx\|^2.
\]

If $T$ is normal, so is $S := \lambda I - T$, and therefore $\|(\bar{\lambda} I - T^*)x\| = \|(\lambda I - T)x\|$ (for
all $x \in X$ and $\lambda \in \mathbb{C}$). This implies Part (3).

(4) If $\lambda \in \sigma_r(T)$, then Part (1) implies that $\bar{\lambda} \in \sigma_p(T^*)$, and therefore, by
Part (3), $\lambda \in \sigma_p(T)$, a contradiction. Hence $\sigma_r(T) = \emptyset$.

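(5) If $E(\{\mu\}) \ne 0$, let $x \ne 0$ be in the range of this projection. Then
$E(\cdot)x = E(\cdot)E(\{\mu\})x = E(\cdot \cap \{\mu\})x$ is the point mass measure at $\mu$ (with total mass 1)
multiplied by $x$. Hence,

\[
Tx = \int_{\sigma(T)} \lambda \, dE(\lambda)x = \mu x,
\]

so that $\mu \in \sigma_p(T)$ and each $x \ne 0$ in $E(\{\mu\})X$ is a $\mu$-eigenvector for $T$.

On the other hand, if $\mu \in \sigma_p(T)$ and $x$ is a $\mu$-eigenvector for $T$, let
$\delta_n = \{\lambda \in \mathbb{C};\ |\lambda - \mu| > 1/n\}$ ($n = 1, 2, \dots$) and $f_n(\lambda) := \chi_{\delta_n}(\lambda)/(\mu - \lambda)$. Then $f_n \in \mathbb{B}$
(it is clearly Borel, and bounded by $n$), and therefore $f_n(T) \in B(X)$. Since
$(\mu - \lambda)f_n(\lambda) = \chi_{\delta_n}(\lambda)$, we have

\[
E(\delta_n)x = f_n(T)(\mu I - T)x = f_n(T)0 = 0.
\]

Hence, by $\sigma$-subadditivity of the positive measure $\|E(\cdot)x\|^2$,

\[
\|E(\{\lambda;\ \lambda \ne \mu\})x\|^2 = \Bigl\| E\Bigl( \bigcup_n \delta_n \Bigr) x \Bigr\|^2 \le \sum_n \|E(\delta_n)x\|^2 = 0.
\]

Therefore, $E(\{\lambda;\ \lambda \ne \mu\})x = 0$ and so $E(\{\mu\})x = Ix = x$, that is, the non-zero
vector $x$ belongs to the range of the projection $E(\{\mu\})$.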
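
Part (5) also explains why a normal operator may have no eigenvalues at all. A standard example, recalled here for illustration, is multiplication by the independent variable on $L^2[0,1]$ with Lebesgue measure $m$. The block takes for granted the well-known facts that this operator is selfadjoint with spectrum $[0,1]$ and that its resolution of the identity is multiplication by indicators.

```latex
% Multiplication operator and its resolution of the identity (standard facts, assumed here).
\[
(Tg)(t) := t\,g(t), \qquad
E(\delta)g := \chi_{\delta \cap [0,1]}\, g, \qquad
(f(T)g)(t) = f(t)\,g(t) \qquad (g \in L^2[0,1]).
\]
% Singletons are Lebesgue-null, so no point carries a non-zero projection:
\[
m(\{\mu\}) = 0 \ \Longrightarrow\ E(\{\mu\}) = 0 \ \ (\mu \in \mathbb{C})
\ \Longrightarrow\ \sigma_p(T) = \emptyset ,
\qquad \sigma_r(T) = \emptyset \ \text{by Part (4)},
\]
\[
\sigma(T) = [0,1] = \sigma_c(T).
\]
```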




9.6 Spectral representation 


Construction 9.9. The spectral theorem for normal operators has an interesting
interpretation through isomorphisms of Hilbert spaces. Let $T$ be a (bounded)
normal operator on the Hilbert space $X$, and let $E$ be its resolution of the
identity. Given $x \in X$, define the cycle of $x$ (relative to $T$) as the closed linear
span of the vectors $T^n(T^*)^m x$, $n, m = 0, 1, 2, \dots$. This is clearly a reducing
subspace for $T$. It follows from the Stone-Weierstrass theorem that the cycle
of $x$, $[x]$, coincides with the closure (in $X$) of the subspace

\[
[x]_0 = \{g(T)x;\ g \in C(\sigma(T))\}.
\]

Define $V_0 : [x]_0 \to C(\sigma(T))$ by

\[
V_0\, g(T)x = g.
\]

By Theorem 9.6, Part (1), the map $V_0$ is a well-defined linear isometry of $[x]_0$
onto the subspace $C(\sigma(T))$ of $L^2(\mu)$, where $\mu = \mu_x := \|E(\cdot)x\|^2$ is a regular
finite positive Borel measure with support in $\sigma(T)$. By Corollary 3.21, $C(\sigma(T))$
is dense in $L^2(\mu)$, and therefore $V_0$ extends uniquely as a linear isometry $V$ of
the cycle $[x]$ onto $L^2(\mu)$. If $g$ is a Borel function bounded on $\sigma(T)$, we may
apply Theorem 3.20 to get a sequence $g_n \in C(\sigma(T))$, uniformly bounded on
the spectrum by $\sup_{\sigma(T)} |g|$, such that $g_n \to g$ pointwise $\mu$-almost everywhere
on $\sigma(T)$. It follows from the proof of Part (3) in Theorem 9.6 that
$g(T)x = \lim_n g_n(T)x$, hence $g(T)x \in [x]$. Also $g_n \to g$ in $L^2(\mu)$ by dominated convergence,
and therefore

\[
V g(T)x = V \lim_n g_n(T)x = \lim_n V_0\, g_n(T)x = \lim_n g_n = g,
\]

where the last two limits are in the space $L^2(\mu)$ and equalities are between
$L^2(\mu)$-elements.

For each $y = g(T)x \in [x]_0$ and each Borel function $f$ bounded on $\sigma(T)$, the
function $fg$ (restricted to $\sigma(T)$) is a bounded Borel function on the spectrum,
hence $(fg)(T)x$ makes sense and equals $f(T)g(T)x = f(T)y$. Applying $V$, we get

\[
V f(T)y = V (fg)(T)x = fg = f\, V g(T)x = f\, V y
\]

(in $L^2(\mu)$). Let $M_f$ denote the multiplication by $f$ operator in $L^2(\mu)$, defined
by $M_f g = fg$. Since $V f(T) = M_f V$ on the dense subspace $[x]_0$ of $[x]$, and
both operators are continuous, we conclude that the last relation is valid on $[x]$.
We express this by saying that the isomorphism $V$ intertwines $f(T)|_{[x]}$ and $M_f$
(for all Borel functions $f$ bounded on $\sigma(T)$). Equivalently, $f(T)|_{[x]} = V^{-1}M_fV$,
that is, restricted to the cycle $[x]$, the operators $f(T)$ are unitarily equivalent
(through $V$!) to the multiplication operators $M_f$ on $L^2(\mu)$ with $\mu := \|E(\cdot)x\|^2$.
This is particularly interesting when $T$ possesses a cyclic vector, that is, a vector
$x$ such that $[x] = X$. Using such a cyclic vector in the shown construction, we
conclude that the abstract operator $f(T)$ is unitarily equivalent to the concrete
operator $M_f$ acting in the concrete Hilbert space $L^2(\mu)$, through the Hilbert
isomorphism $V : X \to L^2(\mu)$, for each Borel function $f$ bounded on the spectrum;
in particular, taking $f(\lambda) = f_1(\lambda) := \lambda$, we have $T = V^{-1}MV$, where $M := M_{f_1}$.

The construction is generalized to an arbitrary normal operator $T$ by
considering a maximal family of non-zero mutually orthogonal cycles
$\{[x_j];\ j \in J\}$ for $T$ ($J$ denotes some index set). Such a family exists by Zorn's lemma.

Let $\mu_j = \|E(\cdot)x_j\|^2$ and let $V_j : [x_j] \to L^2(\mu_j)$ be the Hilbert isomorphism
constructed earlier, for each $j \in J$. If $\sum_j \oplus [x_j] \ne X$, pick $x \ne 0$ orthogonal to
the orthogonal sum; then the non-zero cycle $[x]$ is orthogonal to all $[x_j]$ (since
the cycles are reducing subspaces for $T$), contradicting the maximality of the
family shown. Hence $X = \sum_j \oplus [x_j]$. Consider the operator

\[
V := \sum_{j \in J} \oplus V_j
\]

operating on $X = \sum_j \oplus [x_j]$. Recall that if $y = \sum_j \oplus y_j$ is any vector in $X$, then

\[
Vy := \sum_j \oplus V_j y_j.
\]

Then $V$ is a linear isometry of $X$ onto $\sum_j \oplus L^2(\mu_j)$:

\[
\|Vy\|^2 = \sum_j \|V_j y_j\|^2 = \sum_j \|y_j\|^2 = \|y\|^2.
\]

Since each cycle reduces $f(T)$ for all Borel functions $f$ bounded on $\sigma(T)$, we have

\[
V f(T)y = \sum_j \oplus V_j f(T) y_j = \sum_j \oplus f\, V_j y_j = M_f V y,
\]

where $M_f$ is now defined as the orthogonal sum $M_f := \sum_j \oplus M_f^j$, with $M_f^j$
equal to the multiplication by $f$ operator on the space $L^2(\mu_j)$. The Hilbert
isomorphism $V$ is usually referred to as a spectral representation of $X$ (relative
to $T$). Formally:


Theorem 9.10. Let $T$ be a bounded normal operator on the Hilbert space $X$,
and let $E$ be its resolution of the identity. Then there exists an isomorphism $V$
of $X$ onto $\sum_{j \in J} \oplus L^2(\mu_j)$ with $\mu_j := \|E(\cdot)x_j\|^2$ for suitable non-zero mutually
orthogonal vectors $x_j$, such that $f(T) = V^{-1} M_f V$ for all Borel functions $f$
bounded on $\sigma(T)$. The operator $M_f$ acts on $\sum_j \oplus L^2(\mu_j)$ by $M_f \sum_j \oplus g_j = \sum_j \oplus f g_j$.
In case $T$ has a cyclic vector $x$, then there exists an isomorphism
$V$ of $X$ onto $L^2(\mu)$ with $\mu := \|E(\cdot)x\|^2$, such that $f(T) = V^{-1} M_f V$ for all $f$ as
described; here $M_f$ is the ordinary multiplication by $f$ operator on $L^2(\mu)$.
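
The cyclic case of Theorem 9.10 can be checked by hand in finite dimensions. The following is a minimal sketch under assumptions that are not in the text: $T$ is a diagonal matrix with distinct eigenvalues (the lists `eigs` and `x` are illustrative choices), so that any vector with no zero coordinate is cyclic, the measure $\mu = \|E(\cdot)x\|^2$ puts mass $|x_i|^2$ on the $i$-th eigenvalue, and $V$ sends $y = g(T)x$ to the values of $g$ on the spectrum.

```python
# Illustrative finite-dimensional model (an assumption, not taken from the text):
# T = diag(eigs) with distinct eigenvalues, x a vector with no zero coordinate.
eigs = [1.0, 2.0, 4.0]          # spectrum of T
x = [1.0, -2.0, 0.5]            # cyclic vector for T

# mu = ||E(.)x||^2 puts mass |x_i|^2 on the eigenvalue eigs[i].
mu = [abs(c) ** 2 for c in x]

def apply_T(y):
    """Apply the diagonal operator T to a vector y."""
    return [lam * yi for lam, yi in zip(eigs, y)]

def V(y):
    """Send y = g(T)x to g, stored as the list of its values on the spectrum."""
    return [yi / xi for yi, xi in zip(y, x)]

def l2_mu_norm_sq(g):
    """Squared norm of g in L^2(mu)."""
    return sum(abs(gi) ** 2 * m for gi, m in zip(g, mu))

def norm_sq(y):
    """Squared Euclidean norm of y."""
    return sum(abs(yi) ** 2 for yi in y)

y = [3.0, 1.0, -2.0]
lhs = V(apply_T(y))                                   # V T y
rhs = [lam * gi for lam, gi in zip(eigs, V(y))]       # M V y, M = multiplication by lambda
```

Here `l2_mu_norm_sq(V(y))` agrees with `norm_sq(y)` (so $V$ is isometric) and `lhs` agrees with `rhs` (so $VT = MV$), which is the relation $T = V^{-1}MV$ in this toy model.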


9.7 Renorming method 


In the following, $X$ denotes a given Banach space, and $B(X)$ is the Banach
algebra of all bounded linear operators on $X$. The identity operator is denoted
by $I$. If $\mathcal{A} \subset B(X)$, the commutant $\mathcal{A}'$ of $\mathcal{A}$ consists of all $S \in B(X)$ that
commute with every $T \in \mathcal{A}$.

Given an operator $T$, we shall construct a maximal Banach subspace $Z$ of $X$
such that $T$ has a contractive $C(K)$-operational calculus on $Z$, where $K$ is
an adequate compact subset of $\mathbb{C}$ containing the spectrum of $T$. In case $X$ is
reflexive, this construction will associate with $T$ a contractive spectral measure
on $Z$ such that $f(T|_Z)$ is the corresponding spectral integral for each $f \in C(K)$.
This "maximal" spectral integral representation is a generalization of the spectral
theorem for normal operators in Hilbert space. The construction is based on the
following.


Theorem 9.11 (Renorming theorem). Let $\mathcal{A} \subset B(X)$ be such that its strong
closure contains $I$. Let

$$\|x\|_{\mathcal{A}} := \sup_{T \in \mathcal{A}} \|Tx\| \quad (x \in X),$$

and

$$Z = Z(\mathcal{A}) := \{x \in X; \|x\|_{\mathcal{A}} < \infty\}.$$

Then:

(i) $Z$ with the norm $\|\cdot\|_{\mathcal{A}}$ is a Banach subspace of $X$.

(ii) For any $S \in \mathcal{A}'$, $SZ \subset Z$ and $S|_Z \in B(Z)$ with $\|S|_Z\|_{B(Z)} \le \|S\|$.

(iii) If $\mathcal{A}$ is a multiplicative semigroup (for operator multiplication), then $Z$
is $\mathcal{A}$-invariant, and $\mathcal{A}|_Z := \{T|_Z; T \in \mathcal{A}\}$ is contained in the closed unit
ball $B_1(Z)$ of $B(Z)$. Moreover, $Z$ is maximal with this property, that is,
if $W$ is an $\mathcal{A}$-invariant Banach subspace of $X$ such that $\mathcal{A}|_W \subset B_1(W)$,
then $W$ is a Banach subspace of $Z$.


Proof. Subadditivity and homogeneity of $\|\cdot\|_{\mathcal{A}}$ are clear.
Let $\epsilon > 0$. The neighborhood of $I$

$$N := N(I, \epsilon, x) := \{T \in B(X); \|(T - I)x\| < \epsilon\}$$

(in the strong operator topology) meets $\mathcal{A}$ for each $x \in X$. Fix $x$, and pick then
$T \in \mathcal{A} \cap N$. Then

$$\|x\|_{\mathcal{A}} \ge \|Tx\| = \|x + (T - I)x\| \ge \|x\| - \epsilon.$$

Therefore, $\|x\|_{\mathcal{A}} \ge \|x\|$ for all $x \in X$, and it follows in particular that $\|\cdot\|_{\mathcal{A}}$ is a
norm on $Z$.

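Let $\{x_n\} \subset Z$ be $\|\cdot\|_{\mathcal{A}}$-Cauchy. In particular, it is $\|\cdot\|_{\mathcal{A}}$-bounded; let then
$K = \sup_n \|x_n\|_{\mathcal{A}}$. Since $\|\cdot\| \le \|\cdot\|_{\mathcal{A}}$, the sequence is $\|\cdot\|$-Cauchy; let $x$ be its
$X$-limit. Given $\epsilon > 0$, let $n_0 \in \mathbb{N}$ be such that

$$\|x_n - x_m\|_{\mathcal{A}} < \epsilon \quad (n, m > n_0).$$

Then

$$\|Tx_n - Tx_m\| < \epsilon \quad (n, m > n_0;\ T \in \mathcal{A}).$$

Letting $m \to \infty$, we see that

$$\|Tx_n - Tx\| \le \epsilon \quad (n > n_0;\ T \in \mathcal{A}).$$

Hence

$$\|x_n - x\|_{\mathcal{A}} \le \epsilon \quad (n > n_0). \tag{1}$$

Fixing $n > n_0$, we see that

$$\|x\|_{\mathcal{A}} \le \|x_n\|_{\mathcal{A}} + \|x - x_n\|_{\mathcal{A}} \le K + \epsilon < \infty,$$

that is, $x \in Z$, and $x_n \to x$ in $(Z, \|\cdot\|_{\mathcal{A}})$ by (1). This proves Part (i).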
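If $S \in \mathcal{A}'$, then for all $x \in Z$ and $T \in \mathcal{A}$,

$$\|T(Sx)\| = \|S(Tx)\| \le \|S\| \|Tx\| \le \|S\| \|x\|_{\mathcal{A}}.$$

Therefore, $\|Sx\|_{\mathcal{A}} \le \|S\| \|x\|_{\mathcal{A}} < \infty$, that is, $SZ \subset Z$ and $S|_Z \in B(Z)$ with
$\|S|_Z\|_{B(Z)} \le \|S\|$.
In case $\mathcal{A}$ is a multiplicative sub-semigroup of $B(X)$, we have for all $x \in Z$
and $T, U \in \mathcal{A}$

$$\|U(Tx)\| = \|(UT)x\| \le \sup_{V \in \mathcal{A}} \|Vx\| = \|x\|_{\mathcal{A}}.$$

Hence, $\|Tx\|_{\mathcal{A}} \le \|x\|_{\mathcal{A}}$, so that $Z$ is $\mathcal{A}$-invariant and $\mathcal{A}|_Z \subset B_1(Z)$.
Finally, if $W$ is as stated in the theorem, and $x \in W$, then for all $T \in \mathcal{A}$,

$$\|Tx\| \le \|Tx\|_W \le \|T\|_{B(W)} \|x\|_W \le \|x\|_W,$$

and therefore $\|x\|_{\mathcal{A}} \le \|x\|_W$.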
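
Part (iii) of the renorming theorem can be seen in a two-dimensional toy case. The sketch below rests on assumptions that are not in the text: $X = \mathbb{R}^2$ with the Euclidean norm, and the semigroup $\{I, P\}$ generated by the oblique projection `P`, whose operator norm exceeds 1. Since the semigroup is finite, the supremum defining the new norm is a maximum, $Z$ is all of $X$, and $P$ becomes a contraction for the new norm.

```python
import math

def apply(m, x):
    """Apply a 2x2 real matrix m to a vector x in R^2."""
    return (m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1])

def norm(x):
    """Original (Euclidean) norm on X = R^2."""
    return math.hypot(x[0], x[1])

IDENTITY = ((1.0, 0.0), (0.0, 1.0))
P = ((1.0, 1.0), (0.0, 0.0))      # oblique projection: P*P = P, but ||P|| > 1
SEMIGROUP = (IDENTITY, P)         # closed under multiplication and contains I

def renorm(x):
    """The new norm: the supremum of ||Tx|| over T in the semigroup."""
    return max(norm(apply(t, x)) for t in SEMIGROUP)

x = (1.0, 1.0)
px = apply(P, x)
ratio_old = norm(px) / norm(x)        # larger than 1 in the original norm
ratio_new = renorm(px) / renorm(x)    # at most 1 in the renormed space
```

The new norm dominates the old one because $I$ belongs to the semigroup, which mirrors the first step of the proof.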


9.8 Semi-simplicity space 


Let $\Delta \subset \mathbb{R}$ be compact, and let $P_1(\Delta)$ denote the set of all complex polynomials $p$
with

$$\|p\|_\Delta := \sup_\Delta |p| \le 1.$$

Given an arbitrary bounded operator $T$, let $\mathcal{A}$ be the (multiplicative) semigroup

$$\mathcal{A} := \{p(T); p \in P_1(\Delta)\},$$

and let $Z = Z(\mathcal{A})$.
By Theorem 9.11, Part (iii), $p(T)Z \subset Z$ and

$$\|p(T)|_Z\|_{B(Z)} \le 1$$

for all $p \in P_1(\Delta)$. This means that the polynomial operational calculus
$\tau : p \to p(T)|_Z \ (= p(T|_Z))$ is norm-decreasing as a homomorphism of $P(\Delta)$, the
subalgebra of $C(\Delta)$ of all (complex) polynomials restricted to $\Delta$, into $B(Z)$. Since
$P(\Delta)$ is dense in $C(\Delta)$, $\tau$ extends uniquely as a norm-decreasing homomorphism
of the algebra $C(\Delta)$ into $B(Z)$, that is, $T$ is of contractive class $C(\Delta)$ on $Z$. The
Banach subspace $Z$ in this application of the renorming theorem is called the
semi-simplicity space for $T$.

On the other hand, suppose $W$ is a $T$-invariant Banach subspace of $X$, such
that $T$ is of contractive class $C(\Delta)$ on $W$. Then for each $p \in P_1(\Delta)$ and $w \in W$,

$$\|p(T)w\| \le \|p(T)|_W\|_{B(W)} \|w\|_W \le \|w\|_W,$$

and therefore $w \in Z$ and $\|w\|_{\mathcal{A}} \le \|w\|_W$. This shows that $W$ is a Banach
subspace of $Z$, and concludes the proof of the following.


Theorem 9.12. Let $\Delta \subset \mathbb{R}$ be compact, $T \in B(X)$, and let $Z$ be the
semi-simplicity space for $T$, that is, $Z = Z(\mathcal{A})$, where $\mathcal{A}$ is the (multiplicative)
semigroup

$$\{p(T); p \in P_1(\Delta)\}.$$

Then $T$ is of contractive class $C(\Delta)$ on $Z$.

Moreover, $Z$ is maximal in the following sense: if $W$ is a $T$-invariant Banach
subspace of $X$ such that $T$ is of contractive class $C(\Delta)$ on $W$, then $W$ is a Banach
subspace of $Z$.


Remark 9.13. (1) If $X$ is a Hilbert space and $T$ is a bounded selfadjoint
operator on $X$, it follows from Remark (2) in Terminology 7.19 that
$\|p(T)\| = \|p\|_{C(\sigma(T))} \le 1$ for all $p \in P_1(\Delta)$ for any given compact subset $\Delta$ of $\mathbb{R}$ containing
$\sigma(T)$. Therefore $\|x\|_{\mathcal{A}} \le \|x\|$ for all $x \in X$. Hence $Z = X$ (with equality of
norms) in this case.

Another way to see this is by observing that the selfadjoint operator $T$ has a
contractive $C(\Delta)$-operational calculus on $X$ (see Terminology 7.19, Remark (2)).
Therefore, $X$ is a Banach subspace of $Z$ by the maximality statement in
Theorem 9.12. Hence $Z = X$.

(2) If $T$ is a normal operator (with spectrum in a compact set $\Delta \subset \mathbb{C}$), we
may take

$$\mathcal{A} = \{p(T, T^*); p \in P_1(\Delta)\},$$

where $P_1(\Delta)$ is now the set of all complex polynomials $p$ in $\lambda$ and $\bar\lambda$ with
$\|p\|_\Delta := \sup_{\lambda \in \Delta} |p(\lambda, \bar\lambda)| \le 1$.

Then $\mathcal{A}$ is a commutative semigroup (for operator multiplication), and since
polynomials in $\lambda$ and $\bar\lambda$ are dense in $C(\Delta)$ (by Theorem 5.39), we conclude as
before that $Z := Z(\mathcal{A})$ coincides with $X$ (with equality of norms).

(3) In the general Banach space setting, we took $\Delta \subset \mathbb{R}$ in the construction
leading to Theorem 9.12 to ensure the density of $P(\Delta)$ in $C(\Delta)$. If $\Delta$ is any
compact subset of $\mathbb{C}$, Lavrentiev's theorem (cf. Theorem 8.7 in Gamelin, 1969)
states that the wanted density occurs if and only if $\Delta$ is nowhere dense and has
connected complement. Theorem 9.12 is then valid for such $\Delta$, with the same
statement and proof.

(4) When the complement of $\Delta$ is not connected, the choice of $\mathcal{A}$ may be
adapted in some cases so that Theorem 9.12 remains valid.

An important case of this kind is the unit circle:

$$\Delta = \Gamma := \{\lambda \in \mathbb{C}; |\lambda| = 1\}.$$

Suppose $T \in B(X)$ has spectrum in $\Gamma$. Consider the algebra $R(\Gamma)$ of restrictions
to $\Gamma$ of all complex polynomials in $\lambda$ and $\lambda^{-1}$ ($= \bar\lambda$ on $\Gamma$). Following our previous
notation, let

$$R_1(\Gamma) = \{p \in R(\Gamma); \|p\|_\Gamma := \sup_\Gamma |p| \le 1\}.$$

Since $\sigma(T) \subset \Gamma$, $T$ is invertible in $B(X)$. Let $p \in R(\Gamma)$. Writing $p$ as the finite
sum

$$p(\lambda) = \sum_k \alpha_k \lambda^k \quad (\alpha_k \in \mathbb{C},\ k \in \mathbb{Z}),$$

it makes sense to define

$$p(T) := \sum_k \alpha_k T^k.$$

The map $\tau : p \to p(T)$ is the unique algebra homomorphism of $R(\Gamma)$ into
$B(X)$ such that $\tau(p_0) = I$ and $\tau(p_1) = T$, where $p_k(\lambda) = \lambda^k$. As before, we
choose $\mathcal{A}$ to be the (multiplicative) semigroup

$$\mathcal{A} = \{p(T); p \in R_1(\Gamma)\},$$

and we call $Z = Z(\mathcal{A})$ the semi-simplicity space for $T$.
By Theorem 9.11, Part (iii), $Z$ is a $T$-invariant Banach subspace of $X$, and

$$\|p(T)|_Z\|_{B(Z)} \le 1 \quad (p \in R_1(\Gamma)). \tag{1}$$

By Theorem 5.39, $R(\Gamma)$ is dense in $C(\Gamma)$. Consequently, it follows from (1)
that

$$\tau_Z : p \to p(T)|_Z \ (= p(T|_Z))$$

extends uniquely to a norm-decreasing algebra homomorphism of $C(\Gamma)$ into
$B(Z)$, that is, $T$ is of contractive class $C(\Gamma)$ on $Z$.

The maximality of $Z$ is proved word for word as in Theorem 9.12.

We restate Theorem 9.12 in the present case for future reference.


Theorem 9.14. Let $T \in B(X)$ have spectrum on the unit circle $\Gamma$. Let $Z$ be its
semi-simplicity space, as defined in Remark 9.13 (4). Then $Z$ is a $T$-invariant
Banach subspace of $X$ such that $T$ is of contractive class $C(\Gamma)$ on $Z$.

Moreover, $Z$ is maximal in the following sense: if $W$ is a $T$-invariant Banach
subspace of $X$ such that $T$ is of contractive class $C(\Gamma)$ on $W$, then $W$ is a Banach
subspace of $Z$.


Remark 9.15. With a minor change in the choice of $\mathcal{A}$, Theorem 9.14
generalizes (with identical statement and proof) to the case when $\Gamma$ is an arbitrary
compact subset of the plane with planar Lebesgue measure zero. In this case, let
$R(\Gamma)$ denote the algebra of all restrictions to $\Gamma$ of rational functions with poles
off $\Gamma$. Each $f \in R(\Gamma)$ can be written as a finite product

$$f(\lambda) = \gamma \prod_{j,k} (\lambda - \alpha_j)(\lambda - \beta_k)^{-1},$$

with $\gamma, \alpha_j \in \mathbb{C}$ and poles $\beta_k \notin \Gamma$. Since $\beta_k \in \rho(T)$, we may define

$$f(T) := \gamma \prod_{j,k} (T - \alpha_j I)(T - \beta_k I)^{-1}.$$

As before, we choose $\mathcal{A}$ to be the (multiplicative) semigroup

$$\mathcal{A} = \{f(T); f \in R_1(\Gamma)\},$$

where $R_1(\Gamma)$ is the set of all $f \in R(\Gamma)$ with $\|f\|_\Gamma \le 1$.

The corresponding space $Z = Z(\mathcal{A})$ (the semi-simplicity space for $T$) is a
$T$-invariant Banach subspace of $X$ such that $\|f(T)|_Z\|_{B(Z)} \le 1$ for all $f \in R_1(\Gamma)$.
Since $R(\Gamma)$ is dense in $C(\Gamma)$ by the Hartogs–Rosenthal theorem (cf. Exercise 2),
the map $f \to f(T)|_Z = f(T|_Z)$ has a unique extension as a norm-decreasing
algebra homomorphism of $C(\Gamma)$ into $B(Z)$, that is, $T$ is of contractive class
$C(\Gamma)$ on $Z$.


9.9 Resolution of the identity on Z 


Theorem 9.16. Let $\Gamma$ be any compact subset of $\mathbb{C}$ for which the semi-simplicity
space $Z$ was defined earlier, for a given bounded operator $T$ on the Banach space
$X$, with spectrum in $\Gamma$ (recall that, in the general case, $\Gamma$ could be any compact
subset of $\mathbb{C}$ with planar Lebesgue measure zero). Then $T$ is of contractive class
$C(\Gamma)$ on $Z$, and $Z$ is maximal with this property.

Moreover, if $X$ is reflexive, there exists a unique contractive spectral measure
on $Z$, $E$, with support in $\Gamma$ and values in $T''$, such that

$$f(T|_Z)x = \int_\Gamma f \, dE(\cdot)x \tag{1}$$

for all $x \in Z$ and $f \in C(\Gamma)$, where $f \to f(T|_Z)$ denotes the (unique) $C(\Gamma)$-operational
calculus for $T|_Z$.

The integral (1) extends the $C(\Gamma)$-operational calculus for $T|_Z$ in $B(Z)$ to
a contractive $\mathbb{B}(\Gamma)$-operational calculus $\tau : f \in \mathbb{B}(\Gamma) \to f(T|_Z) \in B(Z)$, where
$\mathbb{B}(\Gamma)$ stands for the Banach algebra of all bounded complex Borel functions on
$\Gamma$ with the supremum norm, and $\tau(f)$ commutes with every $U \in B(X)$ that
commutes with $T$.


Proof. The first statement of the theorem was verified in the preceding sections.
Suppose then that $X$ is a reflexive Banach space, and let $f \to f(T|_Z)$ denote
the (unique) $C(\Gamma)$-operational calculus for $T|_Z$ (in $B(Z)$). For each $x \in Z$ and
$x^* \in X^*$, the map

$$f \in C(\Gamma) \to x^* f(T|_Z)x \in \mathbb{C}$$

is a continuous linear functional on $C(\Gamma)$ with norm $\le \|x\|_Z \|x^*\|$ (where we denote
the $Z$-norm by $\|\cdot\|_Z$ rather than $\|\cdot\|_{\mathcal{A}}$). By the Riesz representation theorem,
there exists a unique regular complex Borel measure $\mu = \mu(\cdot; x, x^*)$ on $\mathcal{B}(\mathbb{C})$,
supported by $\Gamma$, such that

$$x^* f(T|_Z)x = \int_\Gamma f \, d\mu(\cdot; x, x^*) \tag{2}$$

and

$$\|\mu(\cdot; x, x^*)\| \le \|x\|_Z \|x^*\| \tag{3}$$

for all $f \in C(\Gamma)$, $x \in Z$, and $x^* \in X^*$.

For each $\delta \in \mathcal{B}(\mathbb{C})$ and $x \in Z$, the map $x^* \in X^* \to \mu(\delta; x, x^*)$ is a continuous
linear functional on $X^*$ with norm $\le \|x\|_Z$ (by (3)). Since $X$ is reflexive, there
exists a unique vector in $X$, which we denote $E(\delta)x$, such that $\mu(\delta; x, x^*) = x^* E(\delta)x$.
The map $E(\delta) : Z \to X$ is clearly linear and norm decreasing.

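If $U \in T'$, then $U$ commutes with $f(T)$ for all $f \in R(\Gamma)$ (notation as in
Section 9.8), hence $U \in \mathcal{A}'$. Let $f \in C(\Gamma)$, and let $f_n \in R(\Gamma)$ converge to $f$
uniformly on $\Gamma$. Since $UZ \subset Z$ and $\|U\|_{B(Z)} \le \|U\|_{B(X)}$ (cf. Theorem 9.11,
Part (ii)), we have

$$\|U f(T|_Z) - f(T|_Z)U\|_{B(Z)} \le \|U[f(T|_Z) - f_n(T|_Z)]\|_{B(Z)} + \|[f_n(T|_Z) - f(T|_Z)]U\|_{B(Z)}$$
$$\le 2\|U\|_{B(X)} \|f(T|_Z) - f_n(T|_Z)\|_{B(Z)} \le 2\|U\|_{B(X)} \|f - f_n\|_{C(\Gamma)} \to 0.$$

Thus, $U$ commutes with $f(T|_Z)$ for all $f \in C(\Gamma)$. Therefore, for all $x \in Z$,
$x^* \in X^*$, and $f \in C(\Gamma)$,

$$\int_\Gamma f \, d\,x^* U E(\cdot)x = \int_\Gamma f \, d\,(U^* x^*) E(\cdot)x = (U^* x^*) f(T|_Z)x$$
$$= x^* U f(T|_Z)x = x^* f(T|_Z)Ux = \int_\Gamma f \, d\,x^* E(\cdot)Ux.$$

By uniqueness of the Riesz representation, it follows that $UE(\delta) = E(\delta)U$ for
all $\delta \in \mathcal{B}(\mathbb{C})$ (i.e., $E$ has values in $T''$). The last relation is true in particular for
all $U \in \mathcal{A}$. Since $\mathcal{A}|_Z \subset B_1(Z)$ (by Theorem 9.11, Part (iii)) and $E(\delta) : Z \to X$ is
norm decreasing, we have for each $x \in Z$ and $\delta \in \mathcal{B}(\mathbb{C})$,

$$\|E(\delta)x\|_Z := \sup_{U \in \mathcal{A}} \|U E(\delta)x\| = \sup_{U \in \mathcal{A}} \|E(\delta)Ux\| \le \sup_{U \in \mathcal{A}} \|Ux\|_Z \le \|x\|_Z.$$

Thus, $E(\delta) \in B_1(Z)$ for all $\delta \in \mathcal{B}(\mathbb{C})$.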
Since $x^*[E(\cdot)x] = \mu(\cdot; x, x^*)$ is a regular countably additive complex measure
on $\mathcal{B}(\mathbb{C})$ for each $x^* \in X^*$, $E$ satisfies Condition (1) of a spectral measure on $Z$
and has support in $\Gamma$. We may then rewrite (2) in the form

$$f(T|_Z)x = \int_\Gamma f \, dE(\cdot)x \quad (f \in C(\Gamma),\ x \in Z), \tag{4}$$

where the integral is defined as in Section 9.2.

Taking $f = f_0 \ (= 1)$ in (4), we see that $E(\mathbb{C}) = E(\Gamma) = I|_Z$.

Since $f \to f(T|_Z)$ is an algebra homomorphism of $C(\Gamma)$ into $B(Z)$, we have
for all $f, g \in C(\Gamma)$ and $x \in Z$ (whence $g(T|_Z)x \in Z$):

$$\int_\Gamma f \, dE(\cdot)[g(T|_Z)x] = f(T|_Z)[g(T|_Z)x] = (fg)(T|_Z)x = \int_\Gamma f g \, dE(\cdot)x.$$

By uniqueness of the Riesz representation, it follows that

$$dE(\cdot)[g(T|_Z)x] = g \, dE(\cdot)x.$$

This means that for all $\delta \in \mathcal{B}(\mathbb{C})$, $g \in C(\Gamma)$, and $x \in Z$,

$$E(\delta)[g(T|_Z)x] = \int_\Gamma \chi_\delta \, g \, dE(\cdot)x, \tag{5}$$

where $\chi_\delta$ denotes the characteristic function of $\delta$.

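We observed that $E(\delta)$ commutes with every $U \in B(X)$ that commutes with
$T$. In particular, $E(\delta)$ commutes with $g(T)$ for all $g \in R(\Gamma)$. If $g \in C(\Gamma)$ and
$g_n \in R(\Gamma) \to g$ uniformly on $\Gamma$, then since $E(\delta) \in B_1(Z)$, we have

$$\|E(\delta)g(T|_Z) - g(T|_Z)E(\delta)\|_{B(Z)}$$
$$\le \|E(\delta)[g(T|_Z) - g_n(T|_Z)]\|_{B(Z)} + \|[g_n(T|_Z) - g(T|_Z)]E(\delta)\|_{B(Z)}$$
$$\le 2\|g - g_n\|_{C(\Gamma)} \to 0.$$

Thus $E(\delta)$ commutes with $g(T|_Z)$ for all $g \in C(\Gamma)$ and $\delta \in \mathcal{B}(\mathbb{C})$.
We can then rewrite (5) in the form

$$\int_\Gamma g \chi_\delta \, dE(\cdot)x = \int_\Gamma g \, dE(\cdot)E(\delta)x$$

for all $x \in Z$, $\delta \in \mathcal{B}(\mathbb{C})$, and $g \in C(\Gamma)$.
Again, by uniqueness of the Riesz representation, it follows that

$$\chi_\delta \, dE(\cdot)x = dE(\cdot)E(\delta)x.$$

Therefore, for each $\epsilon \in \mathcal{B}(\mathbb{C})$ and $x \in Z$,

$$E(\epsilon)E(\delta)x = \int_\Gamma \chi_\epsilon \, dE(\cdot)E(\delta)x = \int_\Gamma \chi_\epsilon \chi_\delta \, dE(\cdot)x = \int_\Gamma \chi_{\epsilon \cap \delta} \, dE(\cdot)x = E(\epsilon \cap \delta)x.$$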


We consider now the map $f \in \mathbb{B}(\Gamma) \to \int_\Gamma f \, dE(\cdot)x \in X$. Denote the integral by
$\tau(f)x$. We have for all $x \in Z$ and $f \in \mathbb{B}(\Gamma)$

$$\|\tau(f)x\| = \sup_{x^* \in X^*;\ \|x^*\| = 1} \left| \int_\Gamma f \, d\,x^* E(\cdot)x \right| \le \|x\|_Z \sup_\Gamma |f|. \tag{6}$$

If $U \in T'$, we saw that $UE(\delta) = E(\delta)U$ for all $\delta \in \mathcal{B}(\mathbb{C})$. It follows from
the definition of the integral with respect to $E(\cdot)x$ (cf. Section 9.2) that for each
$x \in Z$

$$U\tau(f)x = \tau(f)Ux.$$

In particular, this is true for all $U \in \mathcal{A}$. Hence by (6) and Theorem 9.11, Part (iii),

$$\|\tau(f)x\|_Z = \sup_{U \in \mathcal{A}} \|U\tau(f)x\| = \sup_{U \in \mathcal{A}} \|\tau(f)Ux\| \le \sup_\Gamma |f| \sup_{U \in \mathcal{A}} \|Ux\|_Z \le \|f\|_{\mathbb{B}(\Gamma)} \|x\|_Z.$$

Thus $\tau(f) \in B(Z)$ for all $f \in \mathbb{B}(\Gamma)$. Furthermore, the map $\tau : \mathbb{B}(\Gamma) \to B(Z)$
is linear and norm-decreasing, and clearly multiplicative on the simple Borel
functions on $\Gamma$. A routine density argument proves the multiplicativity of $\tau$ on $\mathbb{B}(\Gamma)$.

The uniqueness statement about $E$ is an immediate consequence of the
uniqueness of the Riesz representation.


Remark 9.17. Let $\Delta$ be a closed interval. The semi-simplicity space for $T \in B(X)$
is the smallest Banach subspace $W_0$ in the increasing scale $\{W_m; m = 0, 1, 2, \ldots\}$
of Banach subspaces defined next.

Let $C^m(\Delta)$ denote the Banach algebra of all complex functions with
continuous derivatives in $\Delta$ up to the $m$th order, with pointwise operations
and norm

$$\|f\|_m = \sum_{j=0}^m \sup_\Delta |f^{(j)}| / j!.$$

We apply Theorem 9.11, Part (iii), to the multiplicative semigroup of
operators

$$\mathcal{A}_m := \{p(T); p \in P, \|p\|_m \le 1\},$$

where $P$ denotes the polynomial algebra $\mathbb{C}[t]$ and $m = 0, 1, 2, \ldots$.

Fix $m \in \mathbb{N} \cup \{0\}$. By Theorem 9.11, the Banach subspace $W_m := Z(\mathcal{A}_m)$ is
$\mathcal{A}_m$-invariant and $\mathcal{A}_m|_{W_m} \subset B_1(W_m)$, that is,

$$\|p(T)x\|_{W_m} \le \|x\|_{W_m} \|p\|_m$$

for all $p \in P$ and $x \in W_m$. By density of $P$ in $C^m(\Delta)$, it follows that
$T|_{W_m}$ is of contractive class $C^m(\Delta)$, that is, there exists a norm-decreasing
algebra homomorphism of $C^m(\Delta)$ into $B(W_m)$ that extends the usual polynomial
operational calculus. Moreover, $W_m$ is maximal with this property.




9.10 Analytic operational calculus 


Let $K \subset \mathbb{C}$ be a non-empty compact set, and denote by $H(K)$ the (complex)
algebra of all complex functions analytic in some neighborhood of $K$ (depending
on the function), with pointwise operations. A net $\{f_\alpha\} \subset H(K)$ converges to $f$ if
all $f_\alpha$ are analytic in some fixed neighborhood $\Omega$ of $K$, and $f_\alpha \to f$ pointwise
uniformly on every compact subset of $\Omega$. When this is the case, $f$ is analytic
in $\Omega$ (hence $f \in H(K)$), and one verifies easily that the operations in $H(K)$
are continuous relative to the described convergence concept (or, equivalently,
relative to the topology associated with the described convergence concept).
Thus, $H(K)$ is a so-called topological algebra.

Throughout this section, $\mathcal{A}$ denotes a fixed (complex) Banach algebra with
unit $e$. Let $\mathcal{F}$ be a topological algebra of complex functions defined on some
subset of $\mathbb{C}$, with pointwise operations, such that $f_k(\lambda) := \lambda^k \in \mathcal{F}$ for $k = 0, 1$.
Given $a \in \mathcal{A}$, an $\mathcal{F}$-operational calculus for $a$ is a continuous representation
$\tau : f \to \tau(f)$ of $\mathcal{F}$ into $\mathcal{A}$ such that $\tau(f_1) = a$ (in the present context, a
representation is an algebra homomorphism sending the identity $f_0$ of $\mathcal{F}$ to the
identity $e$ of $\mathcal{A}$).

Notation 9.18. Let $K \subset \Omega \subset \mathbb{C}$, $K$ compact and $\Omega$ open. There exists an open
set $\Delta$ with boundary $\Gamma$ consisting of finitely many positively oriented rectifiable
Jordan curves, such that $K \subset \Delta$ and $\Delta \cup \Gamma \subset \Omega$. Let $\Gamma(K, \Omega)$ denote the family
of all $\Gamma$s with these properties.

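If $F$ is an $\mathcal{A}$-valued function analytic in $\Omega \cap K^c$ and $\Gamma, \Gamma' \in \Gamma(K, \Omega)$, it follows
from (the vector-valued version of) Cauchy's theorem that

$$\int_\Gamma F(\lambda) \, d\lambda = \int_{\Gamma'} F(\lambda) \, d\lambda.$$

We apply this observation to the function $F = f(\cdot)R(\cdot; a)$ when $\sigma(a) \subset K$ and
$f \in H(K)$, to conclude that the so-called Riesz–Dunford integral

$$\tau(f) := \frac{1}{2\pi i} \int_\Gamma f(\lambda) R(\lambda; a) \, d\lambda \tag{1}$$

(with $\Gamma \in \Gamma(K, \Omega)$ and $f$ analytic in the open neighborhood $\Omega$ of $K$) is a well-defined
element of $\mathcal{A}$ (independent of the choice of $\Gamma$).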
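
The Riesz–Dunford integral can be approximated numerically when $\mathcal{A}$ is an algebra of matrices. The following is a rough sketch under assumptions that are not in the text: $a$ is the sample $2 \times 2$ matrix `A` below (spectrum $\{1, 2\}$), the contour is the circle $|\lambda| = 4$, and the integral is replaced by an equally spaced Riemann sum on that circle. The function names are illustrative.

```python
import cmath
import math

A = ((1.0, 1.0), (0.0, 2.0))   # sample element a; its spectrum is {1, 2}

def resolvent(lam, a):
    """R(lam; a) = (lam*I - a)^(-1) for a 2x2 matrix a."""
    m00, m01 = lam - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], lam - a[1][1]
    det = m00 * m11 - m01 * m10
    return ((m11 / det, -m01 / det), (-m10 / det, m00 / det))

def riesz_dunford(f, a, center=0.0, radius=4.0, nodes=256):
    """Approximate (1/(2*pi*i)) * integral of f(lam) R(lam; a) d lam over the
    positively oriented circle |lam - center| = radius."""
    total = [[0j, 0j], [0j, 0j]]
    for k in range(nodes):
        w = cmath.exp(2j * math.pi * k / nodes)
        lam = center + radius * w
        r = resolvent(lam, a)
        weight = f(lam) * radius * w / nodes   # d lam / (2*pi*i) at this node
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * r[i][j]
    return total

exp_a = riesz_dunford(cmath.exp, A)            # approximates exp(a)
ident = riesz_dunford(lambda lam: lam, A)      # approximates tau(f_1) = a
```

For this upper triangular `A`, the exponential has diagonal entries $e$ and $e^2$ and off-diagonal entry $e^2 - e$, which gives a direct check of `exp_a`; likewise `ident` should reproduce `A`.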


Theorem 9.19. The element $a \in \mathcal{A}$ has an $H(K)$-operational calculus iff
$\sigma(a) \subset K$. In that case, the $H(K)$-operational calculus for $a$ is unique, and
is given by (1).


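Proof. Suppose $\sigma(a) \subset K$. For $f \in H(K)$, define $\tau(f)$ by (1). As observed,
$\tau : H(K) \to \mathcal{A}$ is well defined and clearly linear. Let $f, g \in H(K)$; suppose both
are analytic in the open neighborhood $\Omega$ of $K$. Let $\Gamma' \in \Gamma(K, \Omega)$, and let $\Delta'$
be the open neighborhood of $K$ with boundary $\Gamma'$. Choose $\Gamma \in \Gamma(K, \Delta')$ (hence
$\Gamma \in \Gamma(K, \Omega)$). By Cauchy's integral formula,

$$\frac{1}{2\pi i} \int_{\Gamma'} \frac{g(\mu)}{\mu - \lambda} \, d\mu = g(\lambda) \quad (\lambda \in \Gamma), \tag{2}$$

and

$$\int_\Gamma \frac{f(\lambda)}{\lambda - \mu} \, d\lambda = 0 \quad (\mu \in \Gamma'). \tag{3}$$

Therefore, by the resolvent identity for $R(\cdot) := R(\cdot; a)$, Fubini's theorem, and
Relations (2) and (3), we have

$$(2\pi i)^2 \tau(f)\tau(g) = \int_\Gamma f(\lambda)R(\lambda) \, d\lambda \int_{\Gamma'} g(\mu)R(\mu) \, d\mu$$
$$= \int_\Gamma \int_{\Gamma'} f(\lambda)g(\mu) \frac{R(\lambda) - R(\mu)}{\mu - \lambda} \, d\mu \, d\lambda$$
$$= \int_\Gamma f(\lambda)R(\lambda) \int_{\Gamma'} \frac{g(\mu)}{\mu - \lambda} \, d\mu \, d\lambda + \int_{\Gamma'} g(\mu)R(\mu) \int_\Gamma \frac{f(\lambda)}{\lambda - \mu} \, d\lambda \, d\mu$$
$$= 2\pi i \int_\Gamma f(\lambda)g(\lambda)R(\lambda) \, d\lambda = (2\pi i)^2 \tau(fg).$$

This proves the multiplicativity of $\tau$.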

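Let $\{f_\alpha\}$ be a net in $H(K)$ converging to $f$, let $\Omega$ be an open neighborhood
of $K$ in which all $f_\alpha$ (and $f$) are analytic, and let $\Gamma \in \Gamma(K, \Omega)$. Since $R(\cdot)$ is
continuous on $\Gamma$, we have $M := \sup_\Gamma \|R(\cdot)\| < \infty$, and therefore, denoting the
(finite) length of $\Gamma$ by $|\Gamma|$, we have

$$\|\tau(f_\alpha) - \tau(f)\| = \|\tau(f_\alpha - f)\| \le \frac{M|\Gamma|}{2\pi} \sup_\Gamma |f_\alpha - f| \to 0$$

since $f_\alpha \to f$ uniformly on $\Gamma$. This proves the continuity of $\tau : H(K) \to \mathcal{A}$.

If $f$ is analytic in the disc $\Delta_r := \{\lambda; |\lambda| < r\}$ and $K \subset \Delta_r$, the (positively
oriented) circle $C_\rho = \{\lambda; |\lambda| = \rho\}$ is in $\Gamma(K, \Delta_r)$ for suitable $\rho < r$, and for any
$a \in \mathcal{A}$ with spectrum in $K$, the Neumann series expansion of $R(\lambda)$ converges
uniformly on $C_\rho$. Integrating term by term, we get

$$\tau(f) = \sum_{n=0}^\infty \left( \frac{1}{2\pi i} \int_{C_\rho} \frac{f(\lambda)}{\lambda^{n+1}} \, d\lambda \right) a^n = \sum_{n=0}^\infty \frac{f^{(n)}(0)}{n!} a^n. \tag{4}$$

In particular, $\tau(\lambda^k) = a^k$ for all $k = 0, 1, 2, \ldots$, and we conclude that $\tau$ is an
$H(K)$-operational calculus for $a$.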

If $\tau'$ is any $H(K)$-operational calculus for $a$, then necessarily $\tau'(\lambda^k) = a^k$ for
all $k = 0, 1, 2, \ldots$, that is, $\tau'$ coincides with $\tau$ on all polynomials. If $f$ is a rational
function with poles in $K^c$, then $f \in H(K)$. Writing $f = p/q$ with polynomials $p, q$
such that $q \neq 0$ on $K$ (so that $1/q \in H(K)$!), we have $e = \tau'(1) = \tau'(q \cdot (1/q)) =
\tau'(q)\tau'(1/q)$, hence, $\tau'(q)$ is non-singular, with inverse $\tau'(1/q)$. Therefore $\tau'(f) =
\tau'(p)\tau'(1/q) = \tau(p)\tau'(q)^{-1} = \tau(p)\tau(q)^{-1} = \tau(f)$. By Runge's theorem (cf.
Exercise 1), the rational functions with poles in $K^c$ are dense in $H(K)$, and
the conclusion $\tau'(f) = \tau(f)$ for all $f \in H(K)$ follows from the continuity of both
$\tau'$ and $\tau$. This proves the uniqueness of the $H(K)$-operational calculus for $a$.

Finally, suppose $a \in \mathcal{A}$ has an $H(K)$-operational calculus $\tau$, and let $\mu \in K^c$.
The polynomial $q(\lambda) := \mu - \lambda$ does not vanish on $K$, so that (as was proved
earlier) $\tau(q) \ (= \mu e - a)$ is non-singular, and $R(\mu; a) = \tau(1/(\mu - \lambda))$. In particular,
$\sigma(a) \subset K$.


Let $\tau$ be the $H(\sigma(a))$-operational calculus for $a$. Since $\tau(f) = f(a)$ when $f$
is a polynomial, it is customary to use the notation $f(a)$ instead of $\tau(f)$ for all
$f \in H(\sigma(a))$.


Theorem 9.20 (Spectral mapping theorem). Let $a \in \mathcal{A}$ and $f \in H(\sigma(a))$.
Then $\sigma(f(a)) = f(\sigma(a))$.


Proof. Let $\mu = f(\lambda)$ with $\lambda \in \sigma(a)$. Since $\lambda$ is a zero of the analytic function
$\mu - f$, there exists $h \in H(\sigma(a))$ such that

$$\mu - f(\zeta) = (\lambda - \zeta)h(\zeta)$$

in a neighborhood of $\sigma(a)$. Applying the $H(\sigma(a))$-operational calculus, we get

$$\mu e - f(a) = (\lambda e - a)h(a) = h(a)(\lambda e - a).$$

If $\mu \in \rho(f(a))$, and $v = R(\mu; f(a))$, then

$$(\lambda e - a)[h(a)v] = e \quad \text{and} \quad [vh(a)](\lambda e - a) = e,$$

that is, $\lambda \in \rho(a)$, contradiction. Hence, $\mu \in \sigma(f(a))$, and we proved the inclusion
$f(\sigma(a)) \subset \sigma(f(a))$.

If $\mu \notin f(\sigma(a))$, then $\mu - f \neq 0$ on $\sigma(a)$, and therefore $g := 1/(\mu - f) \in H(\sigma(a))$.
Since $(\mu - f)g = 1$ in a neighborhood of $\sigma(a)$, we have $(\mu e - f(a))g(a) = e$,
that is, $\mu \in \rho(f(a))$. This proves the inclusion $\sigma(f(a)) \subset f(\sigma(a))$.
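
A small matrix check of the spectral mapping theorem may help fix ideas. The sketch below makes assumptions that are not in the text: $a$ is a sample symmetric $2 \times 2$ matrix with spectrum $\{1, 3\}$, and $f(\lambda) = \lambda^2 + \lambda$ is a polynomial, so that $f(a)$ can be formed directly as $a^2 + a$ without any contour integral.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matadd(a, b):
    """Sum of two 2x2 matrices."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def spectrum(a):
    """Eigenvalues of a 2x2 matrix, computed from its trace and determinant."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return sorted([(tr - root) / 2, (tr + root) / 2], key=lambda z: (z.real, z.imag))

def f(lam):
    """Sample polynomial f(lam) = lam^2 + lam."""
    return lam * lam + lam

a = ((2.0, 1.0), (1.0, 2.0))                 # spectrum {1, 3}
f_of_a = matadd(matmul(a, a), a)             # f(a) = a^2 + a
image_of_spectrum = sorted((f(lam) for lam in spectrum(a)),
                           key=lambda z: (z.real, z.imag))
spectrum_of_image = spectrum(f_of_a)
```

Both lists come out as $\{2, 12\}$ (as complex numbers with zero imaginary part), in agreement with $\sigma(f(a)) = f(\sigma(a))$.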


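Theorem 9.21 (Composite function theorem). Let $a \in \mathcal{A}$, $f \in H(\sigma(a))$,
and $g \in H(f(\sigma(a)))$. Then $g \circ f \in H(\sigma(a))$ and $(g \circ f)(a) = g(f(a))$.


Proof. By Theorem 9.20, $g(f(a))$ is well defined.
Let $\Omega$ be an open neighborhood of $K := f(\sigma(a)) = \sigma(f(a))$ in which $g$ is
analytic, and let $\Gamma \in \Gamma(K, \Omega)$. Then

$$g(f(a)) = \frac{1}{2\pi i} \int_\Gamma g(\mu) R(\mu; f(a)) \, d\mu. \tag{5}$$

Since $\Gamma \subset K^c$, for each fixed $\mu \in \Gamma$, the function $\mu - f$ does not vanish on $\sigma(a)$,
and consequently $k_\mu := 1/(\mu - f) \in H(\sigma(a))$. The relation $(\mu - f)k_\mu = 1$ (valid
in a neighborhood of $\sigma(a)$) implies through the operational calculus for $a$ that
$k_\mu(a) = R(\mu; f(a))$, that is,

$$R(\mu; f(a)) = \frac{1}{2\pi i} \int_{\Gamma'} k_\mu(\lambda) R(\lambda; a) \, d\lambda \tag{6}$$

for a suitable $\Gamma'$. We now substitute (6) in (5), interchange the order of
integration, and use Cauchy's integral formula:

$$(2\pi i)^2 g(f(a)) = \int_\Gamma g(\mu) \int_{\Gamma'} k_\mu(\lambda) R(\lambda; a) \, d\lambda \, d\mu = \int_{\Gamma'} \left( \int_\Gamma \frac{g(\mu)}{\mu - f(\lambda)} \, d\mu \right) R(\lambda; a) \, d\lambda$$
$$= 2\pi i \int_{\Gamma'} g(f(\lambda)) R(\lambda; a) \, d\lambda = (2\pi i)^2 (g \circ f)(a).$$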


9.11 Isolated points of the spectrum 


Construction 9.22. Let $\mu$ be an isolated point of $\sigma(a)$. There exists then
a function $e_\mu \in H(\sigma(a))$ that equals 1 in a neighborhood of $\mu$ and 0 in a
neighborhood of $\sigma_\mu := \sigma(a) \cap \{\mu\}^c$. Set $E_\mu = e_\mu(a)$. The element $E_\mu$ is
independent of the choice of the function $e_\mu$, and since $e_\mu^2 = e_\mu$ in a neighborhood
of $\sigma(a)$, it is an idempotent commuting with $a$.

Let $\delta = \mathrm{dist}(\mu, \sigma_\mu)$. By Laurent's theorem (whose classical proof applies word
for word to vector-valued functions), we have for $0 < |\lambda - \mu| < \delta$:

$$R(\lambda; a) = \sum_{k \in \mathbb{Z}} a_k (\mu - \lambda)^k, \tag{1}$$

where

$$a_k = -\frac{1}{2\pi i} \int_\Gamma (\mu - \lambda)^{-k-1} R(\lambda; a) \, d\lambda, \tag{2}$$

and $\Gamma$ is a positively oriented circle centered at $\mu$ with radius $r < \delta$. Choosing
a function $e_\mu$ as shown that equals 1 in a neighborhood of the corresponding
closed disc, we can add the factor $e_\mu$ to the integrand in (2). For $k \in -\mathbb{N}$,
the new integrand is analytic in a neighborhood $\Omega$ of $\sigma(a)$, and therefore, by
Cauchy's theorem, the circle $\Gamma$ may be replaced by any $\Gamma' \in \Gamma(\sigma(a), \Omega)$. By the
multiplicativity of the analytic operational calculus, it follows that

$$a_{-k} = -(\mu e - a)^{k-1} E_\mu \quad (k \in \mathbb{N}). \tag{3}$$

In particular, it follows from (3) that $a_{-k} = 0$ for all $k \ge k_0$ iff $a_{-k_0} = 0$.
Consequently the point $\mu$ is a pole of order $m$ of $R(\cdot; a)$ iff

$$(\mu e - a)^m E_\mu = 0 \quad \text{and} \quad (\mu e - a)^{m-1} E_\mu \neq 0. \tag{4}$$

Similarly, $R(\cdot; a)$ has a removable singularity at $\mu$ iff $E_\mu = 0$. In this case, the
relation $(\lambda e - a)R(\lambda; a) = R(\lambda; a)(\lambda e - a) = e$ extends by continuity to the point
$\lambda = \mu$, so that $\mu \in \rho(a)$, contradicting our hypothesis. Consequently $E_\mu \neq 0$.

These observations have a particular significance when $\mathcal{A} = B(X)$ for a
Banach space $X$. If $\mu$ is an isolated point of the spectrum of $T \in B(X)$ and $E_\mu$
is the corresponding idempotent, the non-zero projection $E_\mu$ (called the Riesz
projection at $\mu$ for $T$) commutes with $T$, so that its range $X_\mu \neq \{0\}$ is a reducing
subspace for $T$ (cf. Terminology 8.5 (2)).
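
The Riesz projection can be computed numerically for a matrix by integrating the resolvent over a small circle around the isolated point. The following is a sketch under assumptions that are not in the text: $T$ is the sample $2 \times 2$ matrix `A` with spectrum $\{1, 2\}$, $\mu = 1$, the contour is the circle $|\lambda - 1| = 1/2$ (on which one may take $e_\mu = 1$), and the integral is replaced by an equally spaced Riemann sum.

```python
import cmath
import math

A = ((1.0, 1.0), (0.0, 2.0))   # sample operator on C^2; spectrum {1, 2}

def resolvent(lam, a):
    """R(lam; a) = (lam*I - a)^(-1) for a 2x2 matrix a."""
    m00, m01 = lam - a[0][0], -a[0][1]
    m10, m11 = -a[1][0], lam - a[1][1]
    det = m00 * m11 - m01 * m10
    return ((m11 / det, -m01 / det), (-m10 / det, m00 / det))

def riesz_projection(mu, a, radius=0.5, nodes=128):
    """Approximate E_mu = (1/(2*pi*i)) * integral of R(lam; a) d lam over the
    positively oriented circle |lam - mu| = radius."""
    total = [[0j, 0j], [0j, 0j]]
    for k in range(nodes):
        w = cmath.exp(2j * math.pi * k / nodes)
        r = resolvent(mu + radius * w, a)
        weight = radius * w / nodes            # d lam / (2*pi*i) at this node
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * r[i][j]
    return total

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

E = riesz_projection(1.0, A)
E_squared = matmul(E, E)       # should agree with E (idempotent)
EA, AE = matmul(E, A), matmul(A, E)   # should agree (E commutes with A)
```

For this `A` the exact projection has rows $(1, -1)$ and $(0, 0)$: it fixes the eigenvector $(1, 0)$ for the eigenvalue 1 and annihilates the eigenvector $(1, 1)$ for the eigenvalue 2.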


Let $T_\mu := T|_{X_\mu}$. If $\zeta \neq \mu$, the function $h(\lambda) := e_\mu(\lambda)/(\zeta - \lambda)$ belongs to
$H(\sigma(T))$ for a proper choice of $e_\mu$, and $(\zeta - \lambda)h(\lambda) = e_\mu$. Applying the analytic
operational calculus, we get

$$(\zeta I - T)h(T) = h(T)(\zeta I - T) = E_\mu,$$

and therefore (since $h(T)X_\mu \subset X_\mu$),

$$(\zeta I - T_\mu)h(T)x = h(T)(\zeta I - T_\mu)x = x \quad (x \in X_\mu).$$

Hence, $\zeta \in \rho(T_\mu)$, and consequently $\sigma(T_\mu) \subset \{\mu\}$. Since $X_\mu \neq \{0\}$, the spectrum
of $T_\mu$ is non-empty (cf. Theorem 7.6), and therefore

$$\sigma(T_\mu) = \{\mu\}. \tag{5}$$

Consider the complementary projection $E'_\mu := I - E_\mu$, and let $X'_\mu := E'_\mu X$
and $T'_\mu := T|_{X'_\mu}$. The argument (with $h(\lambda) := (1 - e_\mu)/(\zeta - \lambda)$ for a fixed $\zeta \notin \sigma_\mu$)
shows that $\sigma(T'_\mu) \subset \sigma_\mu$. If the inclusion is strict, pick $\zeta \in \sigma_\mu \cap \rho(T'_\mu)$. Then
$\zeta \neq \mu$, so that $\exists R(\zeta; T_\mu)$ (by (5)), and of course $\exists R(\zeta; T'_\mu)$. Let

$$V := R(\zeta; T_\mu)E_\mu + R(\zeta; T'_\mu)E'_\mu. \tag{6}$$

Clearly, $V \in B(X)$, and a simple calculation shows that $(\zeta I - T)V = V(\zeta I - T) = I$.
Hence, $\zeta \in \rho(T)$, contradicting the fact that $\zeta \in \sigma_\mu \ (\subset \sigma(T))$. Consequently

$$\sigma(T'_\mu) = \sigma_\mu. \tag{7}$$

We also read from (6) (and the shown observation) that $V = R(\zeta; T)$ for all
$\zeta \notin \sigma(T)$, that is,

$$R(\zeta; T_\mu) = R(\zeta; T)|_{X_\mu} \quad (\zeta \neq \mu); \tag{8}$$
$$R(\zeta; T'_\mu) = R(\zeta; T)|_{X'_\mu} \quad (\zeta \notin \sigma_\mu). \tag{9}$$

(Rather than discussing an isolated point, we could consider any closed subset
$\sigma$ of the spectrum, whose complement in the spectrum is also closed; such a set
is called a spectral set. The previous arguments and conclusions go through with
very minor changes.)

By (4), the isolated point $\mu$ of $\sigma(T)$ is a pole of order $m$ of $R(\cdot; T)$ iff

$$(\mu I - T)^m X_\mu = \{0\} \quad \text{and} \quad (\mu I - T)^{m-1} X_\mu \neq \{0\}. \tag{10}$$

In this case, any non-zero vector in the latter space is an eigenvector of $T$ for
the eigenvalue $\mu$, that is, $\mu \in \sigma_p(T)$.
By (10), $X_\mu \subset \ker(\mu I - T)^m$. Let $x \in \ker(\mu I - T)^m$. Since $E_\mu$ commutes
with $T$,

$$(\mu I - T'_\mu)^m (I - E_\mu)x = (I - E_\mu)(\mu I - T)^m x = 0.$$

By (7), $\mu \in \rho(T'_\mu)$, so that $\mu I - T'_\mu$ is one-to-one; hence, $(I - E_\mu)x = 0$, and
therefore, $x \in X_\mu$, and we conclude that

$$X_\mu = \ker(\mu I - T)^m. \tag{11}$$




9.12 Compact operators 


Definition 9.23. Let $X$ be a Banach space, and denote by $S$ its closed unit
ball. An operator $T \in B(X)$ is compact if the set $TS$ is conditionally compact.

Equivalently, $T$ is compact iff it maps bounded sets onto conditionally
compact sets.

In terms of sequences, the compactness of $T$ is characterized by the property:
if $\{x_n\}$ is a bounded sequence, then $\{Tx_n\}$ has a convergent subsequence.

Denote by $K(X)$ the set of all compact operators on $X$.


Proposition 9.24.

(i) $K(X)$ is a closed two-sided ideal in $B(X)$.
(ii) $K(X) = B(X)$ iff $X$ has finite dimension.
(iii) The restriction of a compact operator to a closed invariant subspace is
compact.


Proof. (i) $K(X)$ is trivially stable under linear combinations. Let $T \in K(X)$,
$A \in B(X)$, and let $\{x_n\}$ be a bounded sequence. Let then $\{Tx_{n_k}\}$ be
a convergent subsequence of $\{Tx_n\}$. Then $\{ATx_{n_k}\}$ converges (by continuity of
$A$), that is, $AT \in K(X)$. Also $\{Ax_n\}$ is bounded (by boundedness of $A$), and
therefore $\{TAx_{n_k}\}$ converges for some subsequence, that is, $TA \in K(X)$, and
we conclude that $K(X)$ is a two-sided ideal in $B(X)$.

Suppose $\{T_m\} \subset K(X)$ converges in $B(X)$ to $T$. Let $\{x_n\} \subset X$ be bounded,
say, $\|x_n\| \le M$ for all $n$. By a Cantor diagonal process, we can select a
subsequence $\{x_{n_k}\}$ of $\{x_n\}$ such that $\{T_m x_{n_k}\}_k$ converges for all $m$. Given $\epsilon > 0$,
let $m_0 \in \mathbb{N}$ be such that $\|T - T_m\| < \epsilon/(4M)$ for all $m > m_0$. Fix $m > m_0$, and
then $k_0 = k_0(m)$ such that $\|T_m x_{n_k} - T_m x_{n_j}\| < \epsilon/2$ for all $k, j > k_0$. Then for all
$k, j > k_0$,

$$\|Tx_{n_k} - Tx_{n_j}\| \le \|(T - T_m)(x_{n_k} - x_{n_j})\| + \|T_m x_{n_k} - T_m x_{n_j}\| < [\epsilon/(4M)]2M + \epsilon/2 = \epsilon,$$

and we conclude that $\{Tx_{n_k}\}$ converges to some element $y$.
Hence,

$$\limsup_k \|Tx_{n_k} - y\| \le \epsilon,$$

and therefore, $Tx_{n_k} \to y$ by the arbitrariness of $\epsilon$.
(ii) If $X$ has finite dimension, any linear operator $T$ on $X$ maps bounded sets
onto bounded sets, and a bounded set in $X$ is conditionally compact.
Conversely, if $K(X) = B(X)$, then, equivalently, the identity operator $I$ is
compact, and therefore the closed unit ball $S = IS$ is compact. Hence, $X$ has
finite dimension, by Theorem 5.27.
The proof of (iii) is trivial.


Theorem 9.25 (Schauder). $T \in K(X)$ iff $T^* \in K(X^*)$.


Proof. (1) Let $T \in K(X)$, and let $\{x_n^*\}$ be a bounded sequence in $X^*$, say,
$\|x_n^*\| \le M$ for all $n$. Then, for all $n$, $|x_n^* x| \le M\|T\|$ for all $x \in TS$ and
$|x_n^* x - x_n^* y| \le M\|x - y\|$ for all $x, y \in X$, that is, the sequence of functions $\{x_n^*\}$ is
uniformly bounded and equicontinuous on the compact metric space $\overline{TS}$. By
Theorem 5.48 (also cf. Exercise 3), there exists a subsequence $\{x_{n_k}^*\}$ of $\{x_n^*\}$
converging uniformly on $\overline{TS}$. Hence,

$$\sup_{x \in S} |x_{n_k}^*(Tx) - x_{n_j}^*(Tx)| \to 0 \quad (k, j \to \infty),$$

that is, $\|T^* x_{n_k}^* - T^* x_{n_j}^*\| \to 0$ as $k, j \to \infty$, and consequently $\{T^* x_{n_k}^*\}$ converges
(strongly) in $X^*$. This proves that $T^*$ is compact.

(2) Let $T^*$ be compact. By Part (1) of the proof, $T^{**}$ is compact. Let
$\{x_n\} \subset S$. Then $\|\hat{x}_n\| = \|x_n\| \le 1$, and therefore, $T^{**}\hat{x}_{n_k}$ converges in $X^{**}$
for some $1 \le n_1 < n_2 < \cdots$, that is,

$$\sup_{\|x^*\| = 1} |(T^{**}\hat{x}_{n_k})x^* - (T^{**}\hat{x}_{n_j})x^*| \to 0 \quad (k, j \to \infty).$$

Equivalently,

$$\sup_{\|x^*\| = 1} |x^* T x_{n_k} - x^* T x_{n_j}| \to 0,$$

that is, $\|Tx_{n_k} - Tx_{n_j}\| \to 0$ as $k, j \to \infty$, and consequently $T$ is compact.


Lemma 9.26. Let $Y$ be a proper closed subspace of the Banach space $X$. Then
$\sup_{x \in X_1} d(x, Y) = 1$. ($X_1$ denotes the unit sphere of $X$.)


Proof. Let $1 > \epsilon > 0$. If $d(x, Y) = 0$ for all $x \in X_1$, then since $Y$ is closed,
$X_1 \subset Y$, and therefore, $X \subset Y$ (because $Y$ is a subspace), contrary to the
assumption that $Y$ is a proper subspace of $X$. Thus there exists $x_1 \in X_1$ such
that $\delta := d(x_1, Y) > 0$. By definition of $d(x_1, Y)$, there exists $y_1 \in Y$ such that
$(\delta \le)\ d(x_1, y_1) < (1 + \epsilon)\delta$. Let $u = x_1 - y_1$ and $x = u/\|u\|$. Then $x \in X_1$,
$\|u\| < (1 + \epsilon)\delta$, and for all $y \in Y$

$$(1 + \epsilon)\delta \|x - y\| > \|u\| \|x - y\| = \|u - \|u\| y\| = \|x_1 - (y_1 + \|u\| y)\| \ge \delta.$$

Hence, $\|x - y\| > 1/(1 + \epsilon) > 1 - \epsilon$ for all $y \in Y$, and therefore $d(x, Y) \ge 1 - \epsilon$.
Since we have trivially $d(x, Y) \le 1$, the conclusion of the lemma follows.


Theorem 9.27 (Riesz–Schauder). Let $T$ be a compact operator on the Banach
space $X$. Then

(i) $\sigma(T)$ is at most countable. If $\{\mu_n\}$ is a sequence of distinct non-zero points
of the spectrum, then $\mu_n \to 0$.

(ii) Each non-zero point $\mu \in \sigma(T)$ is an isolated point of the spectrum, and
is an eigenvalue of $T$ and a pole of the resolvent of $T$. If $m$ is the order
of the pole $\mu$, and $E_\mu$ is the Riesz projection for $T$ at $\mu$, then its range
$E_\mu X$ equals $\ker(\mu I - T)^m$ and is finite dimensional. In particular, the
$\mu$-eigenspace of $T$ is finite dimensional.


Proof. (1) Let $\mu$ be a non-zero complex number, and let $\{x_n\} \subset X$ be such
that $(\mu I - T)x_n$ converge to some $y$. If $\{x_n\}$ is unbounded, say $0 < \|x_n\| \to \infty$
without loss of generality (w.l.o.g.), consider the unit vectors $z_n := x_n/\|x_n\|$.
Since $T$ is compact, there exist $1 \le n_1 < n_2 < \cdots$ such that $Tz_{n_k} \to v \in X$. Then

$$\mu z_{n_k} = \frac{1}{\|x_{n_k}\|}(\mu I - T)x_{n_k} + Tz_{n_k} \to 0 \cdot y + v = v. \tag{*}$$

Hence

$$\mu v = \lim_k T(\mu z_{n_k}) = Tv.$$

If $\mu \notin \sigma_p(T)$, we must have $v = 0$. Then by (*) $|\mu| = \|\mu z_{n_k}\| \to 0$, a contradiction.
Therefore (if $\mu \notin \sigma_p(T)$!), the sequence $\{x_n\}$ is bounded, and has therefore a
subsequence $\{x_{n_k}\}$ such that $\exists \lim_k Tx_{n_k} := u$. Then as shown

$$x_{n_k} = \mu^{-1}[(\mu I - T)x_{n_k} + Tx_{n_k}] \to \mu^{-1}(y + u) := x,$$

and therefore $y = \lim_k (\mu I - T)x_{n_k} = (\mu I - T)x \in (\mu I - T)X$. Thus $(\mu I - T)X$
is closed. This proves that a non-zero $\mu$ is either in $\sigma_p(T)$ or else the range of
$\mu I - T$ is closed. In the latter case, if this range is dense in $X$, $\mu I - T$ is onto
(and one-to-one!), and therefore $\mu \in \rho(T)$. If the range is not dense in $X$, it is
a proper closed subspace of $X$; by Corollary 5.4, there exists $x^* \neq 0$ such that
$x^*(\mu I - T)x = 0$ for all $x \in X$, that is, $(\mu I - T^*)x^* = 0$. Thus $\mu \in \sigma_p(T^*)$. In
conclusion, if $\mu \in \sigma(T)$ is not zero, then $\mu \in \sigma_p(T) \cup \sigma_p(T^*)$.

(2) Suppose $\mu_n$, $n = 1,2,\ldots$, are distinct eigenvalues of $T$ that do not converge to zero. By passing if necessary to a subsequence, we may assume that $|\mu_n| \ge \epsilon$ for all $n$, for some positive $\epsilon$. Let $x_n$ be an eigenvector of $T$ corresponding to the eigenvalue $\mu_n$. Then $\{x_n\}$ is necessarily linearly independent (an elementary linear algebra fact!). Setting $Y_n := \mathrm{span}\{x_1,\ldots,x_n\}$, $Y_{n-1}$ is therefore a proper closed $T$-invariant subspace of the Banach space $Y_n$, and clearly $(\mu_n I - T)Y_n \subset Y_{n-1}$ (for all $n > 1$). By Lemma 9.26, there exists $y_n \in Y_n$ such that $\|y_n\| = 1$ and $d(y_n, Y_{n-1}) > 1/2$, for each $n > 1$. Set $z_n = y_n/\mu_n$. Since $\|z_n\| \le 1/\epsilon$, there exist $1 < n_1 < n_2 < \cdots$ such that $Tz_{n_k}$ converges. However, for $j > k$,

$$\|Tz_{n_j} - Tz_{n_k}\| = \|y_{n_j} - [(\mu_{n_j} I - T)z_{n_j} + Tz_{n_k}]\| > 1/2,$$

since the vector in square brackets belongs to $Y_{n_j - 1}$, contradiction. This proves that if $\{\mu_n\}$ is a sequence of distinct eigenvalues of $T$, then $\mu_n \to 0$.

(3) Suppose $\mu \in \sigma(T)$, $\mu \ne 0$, is not an isolated point of the spectrum, and let then $\mu_n$ ($n \in \mathbb{N}$) be distinct non-zero points of the spectrum converging to $\mu$. By the conclusion of Part (1) of the proof, $\{\mu_n\} \subset \sigma_p(T) \cup \sigma_p(T^*)$. Since the set $\{\mu_n\}$ is infinite, at least one of its intersections with $\sigma_p(T)$ and $\sigma_p(T^*)$ is infinite. This infinite intersection converges to zero, by Part (2) of the proof (since both $T$ and $T^*$ are compact, by Theorem 9.25). Hence $\mu = 0$, a contradiction! This shows that the non-zero points of $\sigma(T)$ are isolated points of the spectrum. Since $\sigma(T)$ is compact, it then follows that it is at most countable.

(4) Let $\mu \ne 0$, $\mu \in \sigma(T)$, and let $E_\mu$ be the Riesz projection for $T$ at (the isolated point) $\mu$. As before, let $X_\mu = E_\mu X$ and $T_\mu = T|_{X_\mu}$. Let $S_\mu$ denote the closed unit ball of $X_\mu$. Since $\sigma(T_\mu) = \{\mu\}$ (cf. (5) of Section 9.11), we have $0 \in \rho(T_\mu)$, that is, $T_\mu^{-1} \in B(X_\mu)$, and consequently $T_\mu^{-1}S_\mu$ is bounded. The latter's image by the compact operator $T_\mu$ (cf. Proposition 9.24 (iii)) is then conditionally compact; this image is the closed set $S_\mu$, hence $S_\mu$ is compact, and therefore $X_\mu$ is finite dimensional (by Theorem 5.27). Since $\sigma(\mu I - T_\mu) = \mu - \sigma(T_\mu) = \{0\}$ by (5) of Section 9.11, the operator $\mu I - T_\mu$ on the finite dimensional space $X_\mu$ is nilpotent, that is, there exists $m \in \mathbb{N}$ such that $(\mu I - T_\mu)^m = 0$ but $(\mu I - T_\mu)^{m-1} \ne 0$. Equivalently,

$$(\mu I - T)^m E_\mu = 0 \quad\text{and}\quad (\mu I - T)^{m-1} E_\mu \ne 0.$$

By (4) of Section 9.11, $\mu$ is a pole of order $m$ of $R(\cdot;T)$, hence an eigenvalue of $T$ (cf. observation following (10) of Section 9.11), and $\ker(\mu I - T)^m = X_\mu$ by (11) of Section 9.11.


Exercises 


[The first two exercises provide the proofs of theorems used in this chapter.]


Runge’s theorem 


1. Let $S^2 = \overline{\mathbb{C}}$ denote the Riemann sphere, and let $K \subset \mathbb{C}$ be compact. Fix a point $a_j$ in each component $V_j$ of $S^2 - K$, and let $R(\{a_j\})$ denote the set of all rational functions with poles in the set $\{a_j\}$.

If $\mu$ is a complex Borel measure on $K$, we define its Cauchy transform $\hat\mu$ by

$$\hat\mu(z) = \int_K \frac{d\mu(w)}{w-z} \qquad (z \in S^2 - K). \qquad (1)$$


Prove

(a) $\hat\mu$ is analytic in $S^2 - K$.

(b) For $a_j \ne \infty$, let $d_j = d(a_j, K)$ and fix $z \in B(a_j,r) \subset V_j$ (necessarily $r \le d_j$). Observe that

$$\frac{1}{w-z} = \sum_{n=0}^{\infty} \frac{(z-a_j)^n}{(w-a_j)^{n+1}}, \qquad (2)$$

and the series converges uniformly for $w \in K$.
For $a_j = \infty$, we have

$$\frac{1}{w-z} = -\sum_{n=0}^{\infty} \frac{w^n}{z^{n+1}}, \qquad (3)$$

and the series converges uniformly for $w \in K$.

(c) If $\int_K h\,d\mu = 0$ for all $h \in R(\{a_j\})$, then $\hat\mu(z) = 0$ for all $z \in B(a_j,r)$, hence for all $z \in V_j$, for all $j$, and therefore $\hat\mu = 0$ on $S^2 - K$.

(d) Let $\Omega \subset \mathbb{C}$ be open such that $K \subset \Omega$. If $f$ is analytic in $\Omega$ and $\mu$ is as in Part (c), then $\int_K f\,d\mu = 0$. (Hint: represent $f(z) = (1/2\pi i)\int_\Gamma f(w)/(w-z)\,dw$ for all $z \in K$, where $\Gamma \in \Gamma(K,\Omega)$, cf. Notation 9.18, and use Fubini's theorem.)

(e) Prove that $R(\{a_j\})$ is $C(K)$-dense in $H(\Omega)$ (the subspace of $C(K)$ consisting of the analytic functions in $\Omega$ restricted to $K$). (Hint: Theorem 4.9, Corollary 5.3, and Part (d).) The result in Part (e) is Runge's theorem. In particular, the rational functions with poles off $K$ are $C(K)$-dense in $H(\Omega)$.

(f) If $S^2 - K$ is connected, the polynomials are $C(K)$-dense in $H(\Omega)$. (Hint: apply Part (e) with $a = \infty$ in the single component of $S^2 - K$.)


Hartogs–Rosenthal's theorem


2. Use notation as in Exercise 1. Let $m$ denote the $\mathbb{R}^2$-Lebesgue measure.


(a) The integral defining the Cauchy transform $\hat\mu$ converges absolutely $m$-a.e. (Hint: show that


d 
LJ ai td dy < 00 
RIK |w—2| 


by using Tonelli’s theorem and polar coordinates.) 


(b) Let $R(K)$ denote the space of rational functions with poles off $K$. Then $\int_K h\,d\mu = 0$ for all $h \in R(K)$ iff $\hat\mu = 0$ off $K$. (Hint: use Cauchy's formula and Fubini's theorem for the non-trivial implication.)

(c) It can be shown that if $\hat\mu = 0$ $m$-a.e., then $\mu = 0$. Conclude that if $m(K) = 0$ and $\mu$ is a complex Borel measure on $K$ such that $\int_K h\,d\mu = 0$ for all $h \in R(K)$, then $\mu = 0$. Consequently, if $m(K) = 0$, then $R(K)$ is dense in $C(K)$ (cf. Theorem 4.9 and Corollary 5.6). This is the Hartogs–Rosenthal theorem.


Arzelà–Ascoli's theorem


We give an alternative, "sequential" proof of Theorem 5.48, which is closer to the sequential nature of the proof of Theorem 9.25.


3. Let $X$ be a compact metric space. Recall from Section 5.9 that a set $\mathcal{F} \subset C(X)$ is equicontinuous if for each $\epsilon > 0$, there exists $\delta > 0$ such that $|f(x) - f(y)| < \epsilon$ for all $f \in \mathcal{F}$ and $x,y \in X$ such that $d(x,y) < \delta$, and it is pointwise bounded if $\sup_{f \in \mathcal{F}}|f(x)| < \infty$ for each $x \in X$. Prove that if $\mathcal{F}$ is pointwise bounded and equicontinuous, then it is relatively compact in $C(X)$. Sketch: $X$ is necessarily separable. Let $\{a_k\}$ be a countable dense set in $X$. Let $\{f_n\} \subset \mathcal{F}$. $\{f_n(a_1)\}$ is a bounded complex sequence; therefore there is a subsequence $\{f_{n,1}\}$ of $\{f_n\}$ converging at $a_1$; $\{f_{n,1}(a_2)\}$ is a bounded complex sequence, and therefore there is a subsequence $\{f_{n,2}\}$ of $\{f_{n,1}\}$ converging at $a_2$ (and $a_1$). Continuing inductively, we get subsequences $\{f_{n,r}\}$ such that the $(r+1)$-th subsequence is a subsequence of the $r$th subsequence, and the $r$th subsequence converges at the points $a_1,\ldots,a_r$. The diagonal subsequence $\{f_{n,n}\}$ converges at all the points $a_k$. Use the compactness of $X$ and an $\epsilon/3$ argument to show that $\{f_{n,n}\}$ is Cauchy in $C(X)$.


Compact normal operators 


4. Let $X$ be a Hilbert space, and $T \in K(X)$ be normal. Prove that there exist a sequence $\{\lambda_n\} \in c_0$ and a sequence $\{E_n\}$ of pairwise orthogonal finite rank projections such that $\sum_{n=1}^{N} \lambda_n E_n \to T$ in $B(X)$ as $N \to \infty$.

This is the spectral theorem for compact normal operators.


Logarithms of Banach algebra elements 


5. Let $A$ be a unital Banach algebra, and let $x \in A$. Prove:

(a) If $0$ belongs to the unbounded component $V$ of $\rho(x)$, then $x \in \exp A$ $(:= \{e^a; a \in A\})$ (that is, $x$ has a logarithm in $A$). Hint: $\Omega := V^c$ is a simply connected open subset of $\mathbb{C}$ containing $\sigma(x)$, and the analytic function $f(\lambda) = \lambda$ does not vanish on $\Omega$. Therefore, there exists $g$ analytic in $\Omega$ such that $e^g = f$ (cf. Exercise 28 of Chapter 7).

(b) The group generated by $\exp A$ is an open subset of $A$.


6. Let $A$ be a unital Banach algebra, and let $G_e$ denote the component of $G := G(A)$ containing the identity $e$. Prove:

(a) $G_e$ is open.

(b) $G_e$ is a normal subgroup of $G$.

(c) $\exp A \subset G_e$.

(d) $\bigcup \exp A \cdots \exp A$ (the union of all finite products) is an open subset of $G_e$ (cf. Exercise 5(b)).

(e) Let $H$ be the group generated by $\exp A$. Then $H$ is an open and closed subset of $G_e$. Conclude that $H = G_e$.

(f) If $A$ is commutative, then $G_e = \exp A$.


Non-commutative Taylor theorem 
7. Use notation as in Exercise 10, Chapter 7. Let A be a unital Banach 


algebra, and let a,b € A. Prove the following non-commutative Taylor 
theorem for each f € H(a(a,b)): 


j=0 I 
08 “) 
=Yictavg= 
4=0 
In particular, if $a,b$ commute,

$$f(b) = \sum_{j=0}^{\infty} \frac{f^{(j)}(a)}{j!}\,(b-a)^j \qquad (*)$$


for all f € H(a(a,b)), where (in this special case) 
a(a,b) = {A € C3 d(A, o(a)) < r(b—a)}. 


If b— a is quasi-nilpotent, (*) is valid for all f € H(o(a)). 


Positive operators 


8. Let $X$ be a Hilbert space. Recall that $T \in B(X)$ is positive (in symbols, $T \ge 0$) iff $(Tx,x) \ge 0$ for all $x \in X$. Prove:

(a) The positive operator $T$ is non-singular (i.e., invertible in $B(X)$) iff $T - \epsilon I \ge 0$ for some $\epsilon > 0$ (one can also write $T \ge \epsilon I$ to express the last relation).

(b) The (arbitrary) operator $T$ is non-singular iff both $TT^* \ge \epsilon I$ and $T^*T \ge \epsilon I$ for some $\epsilon > 0$.


9. Let $X$ be a Hilbert space, $T \in B(X)$. Prove:

(a) If $T$ is positive, then $|(Tx,y)|^2 \le (Tx,x)(Ty,y)$ for all $x,y \in X$.

(b) Let $\{T_k\} \subset B(X)$ be a sequence of positive operators. Then $T_k \to 0$ in the s.o.t. iff it does so in the w.o.t.

(c) If $0 \le T_k \le T_{k+1} \le KI$ for all $k$ (for some positive constant $K$), then $\{T_k\}$ converges in $B(X)$ in the s.o.t.



Analytic functions operate on A 


10. Let $A$ be a unital commutative Banach algebra, and $a \in A$. Let $f \in H(\sigma(a))$. Prove that there exists $b \in A$ such that $\hat b = f \circ \hat a$. ($\hat a$ denotes the Gelfand transform of $a$.) In particular, if $\hat a$ vanishes nowhere, there exists $b \in A$ such that $\hat b = 1/\hat a$. (This is Wiener's theorem.) (Hint: use the analytic operational calculus.)


Polar decomposition 


11. Let $X$ be a Hilbert space, and let $T \in B(X)$ be non-singular. Prove that there exist a unique pair of operators $S,U$ such that $S$ is non-singular and positive, $U$ is unitary, and $T = US$. If $T$ is normal, the operators $S,U$ commute with each other and with $T$. (Hint: assuming the result, find out how to define $S$ and $U|_{SX}$; verify that $U$ is isometric on $SX$, etc. See Chapter 12 for an extension of this exercise.)


Cayley transform 


12. Let $X$ be a Hilbert space, and let $T \in B(X)$ be selfadjoint. Prove:

(a) The operator $V := (T + iI)(T - iI)^{-1}$ (called the Cayley transform of $T$) is unitary and $1 \notin \sigma(V)$.

(b) Conversely, every unitary operator $V$ such that $1 \notin \sigma(V)$ is the Cayley transform of some selfadjoint operator $T \in B(X)$.
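
The scalar case behind Exercise 12 can be checked numerically. The sketch below is an illustration only, not a solution of the exercise: for real $t$, the number $v = (t+i)/(t-i)$ lies on the unit circle, is never equal to 1, and $t$ is recovered as $t = i(v+1)/(v-1)$. The function names `cayley` and `inverse_cayley` are ours.

```python
def cayley(t: float) -> complex:
    """Scalar Cayley transform v = (t + i)/(t - i) of a real number t."""
    return (t + 1j) / (t - 1j)


def inverse_cayley(v: complex) -> complex:
    """Recover t from v != 1 via t = i(v + 1)/(v - 1)."""
    return 1j * (v + 1) / (v - 1)


samples = [-100.0, -1.0, 0.0, 0.5, 3.0, 1000.0]
for t in samples:
    v = cayley(t)
    # |v| = 1 (the scalar analogue of "V is unitary"), and v != 1.
    print(t, abs(v), abs(v - 1.0), inverse_cayley(v))
```

By the operational calculus, the same computation applied to the points of the real spectrum of a bounded selfadjoint $T$ suggests why $\sigma(V)$ lies on the unit circle and stays away from 1.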


Riemann integrals of operator functions 


13. Let $X$ be a Banach space, and let $T(\cdot) : [a,b] \to B(X)$ be strongly continuous (that is, continuous with respect to the s.o.t. on $B(X)$). Prove:

(a) $\|T(\cdot)\|$ is bounded and lower semi-continuous (l.s.c.) (cf. Exercise 6, Chapter 3).

(b) For each $x \in X$, the Riemann integral $\int_a^b T(t)x\,dt$ is a well-defined element of $X$ with norm $\le \int_a^b \|T(t)\|\,dt\,\|x\|$. Therefore the operator $\int_a^b T(t)\,dt$ defined by $\left(\int_a^b T(t)\,dt\right)x = \int_a^b T(t)x\,dt$ has norm $\le \int_a^b \|T(t)\|\,dt$. For each $S \in B(X)$, $ST(\cdot)$ and $T(\cdot)S$ are strongly continuous on $[a,b]$, and $S\int_a^b T(t)\,dt = \int_a^b ST(t)\,dt$; $\left(\int_a^b T(t)\,dt\right)S = \int_a^b T(t)S\,dt$.

(c) $\left(\int_a^{\cdot} T(s)\,ds\right)'(c) = T(c)$ (derivative in the s.o.t.).

(d) If $T(\cdot) = V'(\cdot)$ (derivative in the s.o.t.) for some operator function $V$, then $\int_a^b T(t)\,dt = V(b) - V(a)$.

(e) If $T(\cdot) : [a,\infty) \to B(X)$ is strongly continuous and $\int_a^\infty \|T(t)\|\,dt < \infty$, then $\lim_{b\to\infty} \int_a^b T(t)\,dt := \int_a^\infty T(t)\,dt$ exists in the norm topology of $B(X)$, and $\left\|\int_a^\infty T(t)\,dt\right\| \le \int_a^\infty \|T(t)\|\,dt$. (Note that $\|T(\cdot)\|$ is l.s.c. by Part (a), and the integral on the right makes sense as the integral of a non-negative Borel function.)


Semigroups of operators 


14. Let $X$ be a Banach space, and let $T(\cdot) : [0,\infty) \to B(X)$ be such that $T(t+s) = T(t)T(s)$ for all $t,s \ge 0$ and $T(0) = I$. (Such a function is called a semigroup of operators.) Assume $T(\cdot)$ is (right) continuous at 0 in the s.o.t. (briefly, $T(\cdot)$ is a $C_0$-semigroup). Prove:

(a) $T(\cdot)$ is right continuous on $[0,\infty)$, in the s.o.t.

(b) Let $c_n := \sup\{\|T(t)\|; 0 \le t \le 1/n\}$. Then there exists $n$ such that $c_n < \infty$. (Fix such an $n$ and let $c := c_n (\ge 1)$.) (Hint: the uniform boundedness theorem.)

(c) With $n$ and $c$ as in Part (b), $\|T(t)\| \le Me^{at}$ on $[0,\infty)$, where $M := c^n (\ge 1)$ and $a := \log M (\ge 0)$.

(d) $T(\cdot)$ is strongly continuous on $(0,\infty)$.

(e) Let $V(t) := \int_0^t T(s)\,ds$. Then

$$T(h)V(t) = V(t+h) - V(h) \qquad (h,t > 0).$$

Conclude that $(1/h)(T(h) - I)V(t) \to T(t) - I$ in the s.o.t., as $h \to 0+$ (i.e., the strong right derivative of $T(\cdot)V(t)$ at 0 exists and equals $T(t) - I$, for each $t > 0$). (Hint: Exercise 13, Part (c).)

(f) Let $\omega := \inf_{t>0} t^{-1}\log\|T(t)\|$. Then $\omega = \lim_{t\to\infty} t^{-1}\log\|T(t)\|$ $(< \infty)$ (cf. Part (c)). (Hint: fix $s > 0$ and $r > s^{-1}\log\|T(s)\|$. Given $t > 0$, let $n = [t/s]$. Then $t^{-1}\log\|T(t)\| \le rns/t + t^{-1}\sup_{[0,s]}\log\|T(\cdot)\|$.) ($\omega$ is called the type of the semigroup $T(\cdot)$.)

(g) Let $\omega$ be the type of $T(\cdot)$. Then the spectral radius of $T(t)$ is $e^{\omega t}$, for each $t > 0$.
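
Parts (f) and (g) can be illustrated on a small matrix semigroup. The sketch below is our own example, not taken from the text: it uses $T(t) = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}$ on $\mathbb{C}^2$ with the Euclidean operator norm. Here $\|T(t)\|$ grows linearly, yet $t^{-1}\log\|T(t)\|$ decreases to the type $\omega = 0$, in agreement with the spectral radius $1 = e^{0 \cdot t}$ of each $T(t)$. The closed form for the norm comes from the largest eigenvalue of $T(t)^{\mathsf T}T(t)$, a matrix with trace $2 + t^2$ and determinant 1.

```python
import math


def T(t: float):
    """The matrix semigroup T(t) = [[1, t], [0, 1]]."""
    return [[1.0, t], [0.0, 1.0]]


def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def op_norm(t: float) -> float:
    """Euclidean operator norm of T(t): the square root of the largest
    eigenvalue of T(t)^T T(t) = [[1, t], [t, 1 + t^2]]."""
    return math.sqrt(((2.0 + t * t) + t * math.sqrt(t * t + 4.0)) / 2.0)


def growth_rate(t: float) -> float:
    """The quantity t^{-1} log ||T(t)|| whose infimum over t > 0 is the type."""
    return math.log(op_norm(t)) / t


# Semigroup law T(t + s) = T(t) T(s) for one exactly representable pair.
semigroup_law_holds = matmul(T(0.25), T(1.5)) == T(1.75)

rates = [growth_rate(t) for t in (1.0, 10.0, 100.0, 1000.0)]
print(semigroup_law_holds, rates)
```

The printed rates decrease towards 0 even though no $T(t)$ with $t > 0$ has norm 1, which shows that the infimum defining the type need not be attained.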


More on the spectral theorem 


Let T be a normal operator on a Hilbert space X and E be its resolution 
of the identity. 


15. Prove that every isolated point in $\sigma(T)$ is an eigenvalue of $T$. (Remark that the converse is false.)

16. (a) Prove that for each $r > 0$, the range of $E(\{\lambda \in \mathbb{C}; |\lambda| > r\})$ is contained in the range $T(X)$ of $T$.

(b) Prove that the range of $E(\mathbb{C}\setminus\{0\})$ is equal to $\overline{T(X)}$.

17. Assume that the set of eigenvectors of $T$ spans a dense subspace of $X$. Prove that

$$T = \sum_{\lambda \in \sigma_p(T)} \lambda E(\{\lambda\})$$

in the strong operator topology (s.o.t.); that is, denoting by $\mathcal{F}$ the set of all finite subsets of $\sigma_p(T)$ ordered by inclusion, the net $\left(\sum_{\lambda \in F} \lambda E(\{\lambda\})\right)_{F \in \mathcal{F}}$ converges to $T$ in the s.o.t. (Hint: to prove that the series converges use (the net version of) Part (c) of Exercise 9. To prove that the sum equals $T$ use Theorem 9.8.)

18. Let $S \in B(X)$. Prove that the following conditions are equivalent:

(i) $T,S$ commute: $TS = ST$.
(ii) for all Borel sets $\delta \subset \mathbb{C}$ the operators $E(\delta), S$ commute.
(iii) for all $f \in \mathbb{B}(\mathbb{C})$ the operators $f(T), S$ commute.

(Hint for (i) $\Longrightarrow$ (ii): use Fuglede's theorem (Exercise 13 in Chapter 7) to obtain that $T^*,S$ commute. Thus, every polynomial in $T$ and in $T^*$ commutes with $S$. Deduce with the aid of Section 7.4 that $f(T), S$ commute for all $f \in C(\sigma(T))$. This implies that $E(\delta),S$ commute at least for certain Borel sets $\delta$.)

19. Continuing Exercise 18, assume that $S$ too is normal and has resolution of the identity $F$. Prove that each of the following conditions is equivalent to $T, S$ commuting:

(iv) for all Borel sets $\delta,\epsilon \subset \mathbb{C}$ the projections $E(\delta), F(\epsilon)$ commute.

(v) for all $f,g \in \mathbb{B}(\mathbb{C})$ the operators $f(T), g(S)$ commute.

10 


Unbounded operators 


This chapter develops the basic structure theory of unbounded operators, 
mostly on Hilbert spaces. This beautiful and deep theory is the work of 
many hands, including Carleman, Friedrichs, Hilbert, von Neumann, Riesz, 
Stone, and Sz.-Nagy. Naturally, unbounded operators are less well-behaved than
bounded operators. Nevertheless, they "arise in nature" (both mathematical
and physical, for example, quantum mechanics), and they have enough structure
to allow for intriguing results and applications.

The first two sections provide some preliminaries, including the spectrum 
of unbounded operators and the Hilbert adjoint of densely defined unbounded 
operators on Hilbert spaces. 

The spectral theorem and the operational calculus for unbounded selfadjoint 
operators are established next. They are the perfect extensions to unbounded 
operators of the results on bounded operators in Chapter 9 carrying the same 
names. In contrast to the bounded case and for simplicity, we stick here to 
selfadjoint operators rather than normal ones. We also work out the theory of 
the semi-simplicity space of unbounded operators with real spectrum on Banach 
spaces. The method of proof of these results involves reducing to bounded 
operators. 

The chapter then discusses the theory of symmetric, and in particular 
selfadjoint, extensions of symmetric operators on Hilbert spaces. It has 
applications in differential equations and quantum mechanics, as well as other 
areas of research. 

The chapter ends with the subject of quadratic forms. We prove the 
representation theorem, linking between closed densely defined quadratic forms 
and positive selfadjoint operators. We also characterize closedness and closability 
of quadratic forms in terms of lower semi-continuity. These results are very 
helpful in operator theory since often it is more convenient to talk about the 
quadratic form associated to an operator than about the operator itself. 

In the exercises we continue the study of semigroups of operators begun in the
exercises of Chapter 9, starting with the (generally unbounded) generator. We
also use quadratic forms to prove that every densely defined positive operator on
a Hilbert space possesses a positive selfadjoint extension; the particular extension
we construct is called the Friedrichs extension.


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press.
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0010


10.1 Basics 


This chapter deals with (linear) operators $T$ with domain $D(T)$ and range $R(T)$ in a Banach space $X$; $D(T)$ and $R(T)$ are (linear) subspaces of $X$. The operators $S,T$ are equal if $D(S) = D(T)$ and $Sx = Tx$ for all $x$ in the (common) domain of $S$ and $T$. If $S,T$ are operators such that $D(S) \subset D(T)$ and $T|_{D(S)} = S$, we say that $T$ is an extension of $S$ (notation: $S \subset T$).

The algebraic operations between unbounded operators are defined with the obvious restrictions on domains. Both sum and product are associative, but the distributive laws take the form

$$AB + AC \subset A(B + C); \qquad (A + B)C = AC + BC.$$

The graph of $T$ is the subspace of $X \times X$ given by

$$\Gamma(T) := \{[x,Tx]; x \in D(T)\}.$$

The operator $T$ is closed if $\Gamma(T)$ is a closed subspace of $X \times X$. A convenient elementary criterion for $T$ being closed is the following condition:

If $\{x_n\} \subset D(T)$ is such that $x_n \to x$ and $Tx_n \to y$, then $x \in D(T)$ and $Tx = y$.


Clearly, if $D(T)$ is closed and $T$ is continuous on $D(T)$, then $T$ is a closed operator. In particular, every $T \in B(X)$ is closed. Conversely, if $T$ is a closed operator with closed domain (hence a Banach space!), then $T$ is continuous on $D(T)$, by the closed graph theorem. Also if $T$ is closed and continuous (on its domain), then it has a closed domain.

If $B \in B(X)$ and $T$ is closed, then $T + B$ and $TB$ (with their "maximal domains" $D(T)$ and $\{x \in X; Bx \in D(T)\}$, respectively) are closed operators. In particular, the operators $\lambda I - T$ and $\lambda T$ are closed, for any $\lambda \in \mathbb{C}$.

If $B \in B(X)$ is non-singular and $T$ is closed, then $BT$ (with domain $D(T)$) is closed.


Usually, the norm taken on $X \times X$ is $\|[x,y]\| = \|[x,y]\|_1 := \|x\| + \|y\|$, or in case $X$ is a Hilbert space, $\|[x,y]\| = \|[x,y]\|_2 := \sqrt{\|x\|^2 + \|y\|^2}$. These norms are equivalent, since

$$\|[x,y]\|_2 \le \|[x,y]\|_1 \le \sqrt{2}\,\|[x,y]\|_2.$$

If $X$ is a Hilbert space, the space $X \times X$ (also denoted $X \oplus X$) is a Hilbert space with the inner product

$$([x,y],[u,v]) = (x,u) + (y,v),$$

and the norm induced by this inner product is indeed $\|[x,y]\| := \|[x,y]\|_2$.
The graph norm on $D(T)$ is defined by

$$\|x\|_T := \|[x,Tx]\| \qquad (x \in D(T)).$$


We shall denote by $[D(T)]$ the space $D(T)$ with the graph norm. The space $[D(T)]$ is complete iff $T$ is a closed operator.
If the operator $S$ has a closed extension $T$, it clearly satisfies the property

If $\{x_n\} \subset D(S)$ is such that $x_n \to 0$ and $Sx_n \to y$, then $y = 0$.

An operator $S$ with this property is said to be closable. Conversely, if $S$ is closable, then the $X \times X$-closure of its graph, $\overline{\Gamma(S)}$, is the graph of a (necessarily closed) operator $\overline{S}$, called the closure of $S$. Indeed, if $[x,y],[x,y'] \in \overline{\Gamma(S)}$, there exist sequences $\{[x_n,Sx_n]\}$ and $\{[x'_n,Sx'_n]\}$ in $\Gamma(S)$ converging respectively to $[x,y]$ and $[x,y']$ in $X \times X$. Then $x_n - x'_n \in D(S)$, $x_n - x'_n \to 0$, and $S(x_n - x'_n) \to y - y'$. Therefore, $y - y' = 0$ since $S$ is closable. Consequently, the map $\overline{S} : x \mapsto y$ is well defined, clearly linear, and by definition,

$$\Gamma(\overline{S}) = \overline{\Gamma(S)}.$$

Hence the closable operator $S$ has the (minimal) closed extension $\overline{S}$.

By definition, $D(\overline{S}) = \{x \in X; \exists \{x_n\} \subset D(S) \text{ such that } x_n \to x \text{ and } \exists \lim Sx_n\}$ and $\overline{S}x$ is equal to the shown limit for $x \in D(\overline{S})$.

A core for a closed operator $T$ is a linear subspace $D$ of $D(T)$ such that $\overline{T|_D} = T$; that is, $\Gamma(T|_D)$ is dense in $\Gamma(T)$; equivalently, $D$ is dense in $[D(T)]$. This means that to each $x \in D(T)$ there is a sequence $\{x_n\}$ in $D$ converging to $x$ such that $\{Tx_n\}$ converges to $Tx$.

If $T$ is one-to-one, the inverse operator $T^{-1}$ with domain $R(T)$ and range $D(T)$ has the graph

$$\Gamma(T^{-1}) = J\Gamma(T),$$

where $J$ is the isometric automorphism of $X \times X$ given by $J[x,y] = [y,x]$. Therefore $T$ is closed iff $T^{-1}$ is closed. In particular, if $T^{-1} \in B(X)$, then $T$ is closed.

The resolvent set $\rho(T)$ of $T$ is the set of all $\lambda \in \mathbb{C}$ such that $\lambda I - T$ has an inverse in $B(X)$; the inverse operator is called the resolvent of $T$, and is denoted by $R(\lambda;T)$ (or $R(\lambda)$, when $T$ is understood). If $\rho(T) \ne \emptyset$, and $\lambda$ is any point in $\rho(T)$, then $R(\lambda)^{-1}$ (with domain $D(T)$) is closed, and therefore $T = \lambda I - R(\lambda;T)^{-1}$ is closed. On the other hand, if $T$ is closed and $\lambda$ is such that $\lambda I - T$ is bijective, then $\lambda \in \rho(T)$ (because $(\lambda I - T)^{-1}$ is closed and everywhere defined, hence belongs to $B(X)$, by the closed graph theorem).

By definition, $TR(\lambda) = \lambda R(\lambda) - I \in B(X)$, while $R(\lambda)T = (\lambda R(\lambda) - I)|_{D(T)}$.

The complement of $\rho(T)$ in $\mathbb{C}$ is the spectrum of $T$, $\sigma(T)$. By the preceding remark, the spectrum of the closed operator $T$ is the disjoint union of the following sets:

• the point spectrum of $T$, $\sigma_p(T)$, which consists of all scalars $\lambda$ for which $\lambda I - T$ is not one-to-one.

• the continuous spectrum of $T$, $\sigma_c(T)$, which consists of all $\lambda$ for which $\lambda I - T$ is one-to-one but not onto, and its range is dense in $X$.

• the residual spectrum of $T$, $\sigma_r(T)$, which consists of all $\lambda$ for which $\lambda I - T$ is one-to-one, and its range is not dense in $X$.


Theorem 10.1. Let $T$ be any (unbounded) operator. Then $\rho(T)$ is open, and $R(\cdot)$ is analytic on $\rho(T)$ and satisfies the "resolvent identity"

$$R(\lambda) - R(\mu) = (\mu - \lambda)R(\lambda)R(\mu) \qquad (\lambda,\mu \in \rho(T)).$$

In particular, $R(\lambda)$ commutes with $R(\mu)$.
Moreover, $\|R(\lambda)\| \ge 1/d(\lambda,\sigma(T))$.


Proof. We assume without loss of generality that the resolvent set is non-empty. Let then $\lambda \in \rho(T)$, and denote $r = \|R(\lambda)\|^{-1}$. We wish to prove that the disc $B(\lambda,r)$ is contained in $\rho(T)$. This will imply that $\rho(T)$ is open and $d(\lambda,\sigma(T)) \ge r$ (i.e., $\|R(\lambda)\| \ge 1/d(\lambda,\sigma(T))$).

For $\mu \in B(\lambda,r)$, the series

$$S(\mu) := \sum_{k=0}^{\infty} [(\lambda - \mu)R(\lambda)]^k$$

converges in $B(X)$, commutes with $R(\lambda)$, and satisfies the identity

$$(\lambda - \mu)R(\lambda)S(\mu) = S(\mu) - I.$$

For $x \in D(T)$,

$$S(\mu)R(\lambda)(\mu I - T)x = S(\mu)R(\lambda)[(\lambda I - T) - (\lambda - \mu)I]x = S(\mu)x - [S(\mu) - I]x = x$$

by the above identity, and similarly, for all $x \in X$,

$$(\mu I - T)R(\lambda)S(\mu)x = [(\lambda I - T) - (\lambda - \mu)I]R(\lambda)S(\mu)x = S(\mu)x - [S(\mu) - I]x = x.$$

This shows that $\mu \in \rho(T)$ and $R(\mu) = R(\lambda)S(\mu)$ for all $\mu \in B(\lambda,r)$.

In particular, $R(\cdot)$ is analytic in $\rho(T)$ (since it is locally the sum of a $B(X)$-convergent power series).

Finally, for $\lambda,\mu \in \rho(T)$, we have on $X$:

$$(\lambda I - T)[R(\lambda) - R(\mu) - (\mu - \lambda)R(\lambda)R(\mu)]$$
$$= I - [(\lambda - \mu)I + (\mu I - T)]R(\mu) - (\mu - \lambda)R(\mu)$$
$$= I - (\lambda - \mu)R(\mu) - I + (\lambda - \mu)R(\mu) = 0.$$

Since $\lambda I - T$ is one-to-one, the resolvent identity follows.
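
The resolvent identity can be checked numerically in the simplest (bounded, finite-dimensional) setting. The following sketch is an illustration under our own assumptions: the $2 \times 2$ matrix `T`, the points `lam` and `mu`, and all helper names are hypothetical choices, and only the Python standard library is used.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_sub(A, B):
    """Entrywise difference A - B."""
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]


def mat_scale(c, A):
    """Scalar multiple c * A."""
    return [[c * A[i][j] for j in range(2)] for i in range(2)]


def mat_inv(A):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def resolvent(lam, T):
    """R(lam; T) = (lam*I - T)^{-1} for lam outside the spectrum of T."""
    return mat_inv(mat_sub([[lam, 0], [0, lam]], T))


def max_abs(A):
    """Largest absolute value of an entry of A."""
    return max(abs(x) for row in A for x in row)


T = [[0, 1], [-2, -3]]            # eigenvalues -1 and -2
lam, mu = 1 + 1j, 2 - 0.5j        # two points of the resolvent set
R_lam, R_mu = resolvent(lam, T), resolvent(mu, T)

# R(lam) - R(mu) - (mu - lam) R(lam) R(mu) should vanish up to rounding.
identity_error = max_abs(mat_sub(mat_sub(R_lam, R_mu),
                                 mat_scale(mu - lam, mat_mul(R_lam, R_mu))))
# The two resolvents should commute.
commutator_error = max_abs(mat_sub(mat_mul(R_lam, R_mu), mat_mul(R_mu, R_lam)))
print(identity_error, commutator_error)
```

Both printed quantities should be of the order of floating-point rounding error.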


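Theorem 10.2. Let $T$ be an unbounded operator in the Banach space $X$, with $\rho(T) \ne \emptyset$. Fix $\alpha \in \rho(T)$ and let $h(\lambda) = 1/(\alpha - \lambda)$. Then $h$ maps $\sigma(T) \cup \{\infty\}$ onto $\sigma(R(\alpha))$.


Proof. (In order to reduce the number of brackets in the following formulae, we shall write $R_\lambda$ instead of $R(\lambda)$.)

Taking complements in $\mathbb{C} \cup \{\infty\}$, we must show that $h$ maps $\rho(T)$ onto $\rho(R_\alpha) \cup \{\infty\}$. Since $h(\alpha) = \infty$, we consider $\lambda \ne \alpha$ in $\rho(T)$, and define

$$V := (\alpha - \lambda)[I + (\alpha - \lambda)R_\lambda].$$

Then $V$ commutes with $h(\lambda)I - R_\alpha$ and by the resolvent identity (cf. Theorem 10.1)

$$[h(\lambda)I - R_\alpha]V = I + (\alpha - \lambda)[R_\lambda - R_\alpha - (\alpha - \lambda)R_\alpha R_\lambda] = I.$$

This shows that $h(\lambda) \in \rho(R_\alpha)$ and $R(h(\lambda);R_\alpha) = V$. Hence $h$ maps $\rho(T)$ into $\rho(R_\alpha) \cup \{\infty\}$.

Next, let $\mu \in \rho(R_\alpha)$. If $\mu = 0$, $T = \alpha I - (\alpha I - T) = \alpha I - R_\alpha^{-1} \in B(X)$, contrary to our hypothesis. Hence $\mu \ne 0$, and let then $\lambda = \alpha - 1/\mu$ (so that $h(\lambda) = \mu$). Let

$$W := \mu R_\alpha R(\mu;R_\alpha).$$

Then $W$ commutes with $\lambda I - T$ and

$$(\lambda I - T)W = [(\lambda - \alpha)I + (\alpha I - T)]W = [\mu(\lambda - \alpha)R_\alpha + \mu I]R(\mu;R_\alpha) = (\mu I - R_\alpha)R(\mu;R_\alpha) = I.$$

Thus $\lambda \in \rho(T)$, and we conclude that $h$ maps $\rho(T)$ onto $\rho(R_\alpha) \cup \{\infty\}$.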
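
A finite-dimensional sanity check of the map $h$ may help. The sketch below is our own illustration, with a hypothetical matrix and hypothetical helper names. For a matrix $T$ (a bounded operator, so outside the hypothesis of the theorem), the resolvent at $\alpha$ is invertible and $0 = h(\infty)$ is not in its spectrum; only the finite part of the statement, that the eigenvalues of $R(\alpha)$ are exactly $1/(\alpha - \lambda)$ for $\lambda \in \sigma(T)$, can be observed.

```python
import cmath


def inv2(A):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def eig2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]


T = [[0, 1], [-2, -3]]     # characteristic polynomial (x + 1)(x + 2)
sigma_T = [-1, -2]
alpha = 1j                 # a non-real point, hence in the resolvent set here

# R(alpha) = (alpha*I - T)^{-1}
R_alpha = inv2([[alpha - T[0][0], -T[0][1]],
                [-T[1][0], alpha - T[1][1]]])

computed = eig2(R_alpha)
predicted = [1 / (alpha - lam) for lam in sigma_T]   # h(lam)

# Largest distance from a predicted point h(lam) to the computed spectrum.
mismatch = max(min(abs(p - c) for c in computed) for p in predicted)
print(computed, predicted, mismatch)
```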


10.2 The Hilbert adjoint 


Terminology 10.3. Let $T$ be an operator with dense domain $D(T)$ in the Hilbert space $X$. For $y \in X$ fixed, consider the function

$$\phi(x) = (Tx,y) \qquad (x \in D(T)). \qquad (1)$$

If $\phi$ is continuous, it has a unique extension as a continuous linear functional on $X$ (since $D(T)$ is dense in $X$), and there exists therefore a unique $z \in X$ such that

$$\phi(x) = (x,z) \qquad (x \in D(T)). \qquad (2)$$

(Conversely, if there exists $z \in X$ such that (2) holds, then $\phi$ is continuous on $D(T)$.)

Let $D(T^*)$ denote the subspace of all $y$ for which $\phi$ is continuous on $D(T)$ (equivalently, for which $\phi = (\cdot,z)$ for some $z \in X$). Define $T^* : D(T^*) \to X$ by $T^*y = z$ (the map $T^*$ is well defined, by the uniqueness of $z$ for given $y$). The defining identity for $T^*$ is then

$$(Tx,y) = (x,T^*y) \qquad (x \in D(T),\ y \in D(T^*)). \qquad (3)$$

It follows clearly from (3) that $T^*$ (with domain $D(T^*)$) is a linear operator. It is called the adjoint operator of $T$.


If $S$ is another operator on $X$ and $T \subset S$, then ($S$ has dense domain and) evidently $S^* \subset T^*$.
By (2), $[y,z] \in \Gamma(T^*)$ iff $(Tx,y) = (x,z)$ for all $x \in D(T)$, that is, iff

$$([Tx,-x],[y,z]) = 0 \qquad \text{for all } x \in D(T).$$

Consider the isometric automorphism of $X \times X$ defined by

$$Q[x,y] = [y,-x].$$

The preceding statement means that $[y,z] \in \Gamma(T^*)$ iff $[y,z]$ is orthogonal to $Q\Gamma(T)$ in $X \times X$. Hence

$$\Gamma(T^*) = (Q\Gamma(T))^\perp. \qquad (4)$$

In particular, it follows from (4) that $T^*$ is closed.
One verifies easily that if $B \in B(X)$, then

$$(T + B)^* = T^* + B^* \quad\text{and}\quad (BT)^* = T^*B^*.$$

It follows in particular (or directly) that $(\lambda T)^* = [(\lambda I)T]^* = \bar\lambda T^*$.
If $T = T^*$, the operator $T$ is called a selfadjoint operator. Since $T^*$ is closed, a selfadjoint operator is necessarily closed and densely defined. An everywhere defined selfadjoint operator is necessarily bounded by the closed graph theorem.
The operator $T$ is symmetric if

$$(Tx,y) = (x,Ty) \qquad (x,y \in D(T)). \qquad (5)$$

If $T$ is densely defined (so that $T^*$ exists), Condition (5) is equivalent to $T \subset T^*$. If $T$ is everywhere defined, it is symmetric iff it is selfadjoint. Therefore, a symmetric everywhere defined operator is a bounded selfadjoint operator.

Selfadjoint operators are maximal symmetric: if $T,S$ are operators on $X$ with $T$ selfadjoint, $S$ symmetric, and $T \subset S$, then $S \subset S^* \subset T^* = T$, thus $T = S$.

If $T$ is one-to-one with domain and range both dense in $X$, the adjoint operators $T^*$ and $(T^{-1})^*$ both exist. If $T^*y = 0$ for some $y \in D(T^*)$, then for all $x \in D(T)$

$$(Tx,y) = (x,T^*y) = (x,0) = 0, \qquad (6)$$

and therefore $y = 0$ since $R(T)$ is dense. Thus, $T^*$ is one-to-one, and $(T^*)^{-1}$ exists. By (4)

$$\Gamma((T^{-1})^*) = [Q\Gamma(T^{-1})]^\perp = [QJ\Gamma(T)]^\perp = [-JQ\Gamma(T)]^\perp = J[Q\Gamma(T)]^\perp = J\Gamma(T^*) = \Gamma((T^*)^{-1}),$$

since $(JA)^\perp = JA^\perp$ for any $A \subset X \times X$. Therefore

$$(T^{-1})^* = (T^*)^{-1}. \qquad (7)$$

It follows that if $T$ is densely defined then

$$R(\lambda;T)^* = R(\bar\lambda;T^*) \qquad (\lambda \in \rho(T)). \qquad (8)$$


In particular, if $T$ is selfadjoint,

$$R(\lambda;T)^* = R(\bar\lambda;T), \qquad (9)$$

and therefore $R(\lambda;T)$ is a bounded normal operator for each $\lambda \in \rho(T)$ (cf. Theorem 10.1).

Note that (6) also shows that for any $T$ with dense domain, $\ker(T^*) \subset R(T)^\perp$. On the other hand, if $y \in R(T)^\perp$, then $(Tx,y) = 0$ for all $x \in D(T)$. In particular, the function $x \mapsto (Tx,y)$ is continuous on $D(T)$, so that $y \in D(T^*)$, and $(x,T^*y) = (Tx,y) = 0$ for all $x \in D(T)$. Since $D(T)$ is dense, it follows that $T^*y = 0$, and we conclude that

$$\ker(T^*) = R(T)^\perp. \qquad (10)$$


Theorem 10.4. Let $T$ be a symmetric operator. Then for any non-real $\lambda \in \mathbb{C}$, $\lambda I - T$ is one-to-one and

$$\|(\lambda I - T)^{-1}y\| \le |\Im\lambda|^{-1}\|y\| \qquad (y \in R(\lambda I - T)). \qquad (11)$$

If $T$ is closed, the range $R(\lambda I - T)$ is closed, and coincides with $X$ if $T$ is selfadjoint. In the latter case, every non-real $\lambda$ is in $\rho(T)$, $R(\lambda;T)$ is a bounded normal operator, and

$$\|R(\lambda;T)\| \le 1/|\Im\lambda|. \qquad (12)$$


Proof. If $T$ is symmetric, $(Tx,x)$ is real for all $x \in D(T)$ (since $(Tx,x) = (x,Tx) = \overline{(Tx,x)}$). Therefore $(Tx,i\beta x)$ is pure imaginary for $\beta \in \mathbb{R}$. Since $\alpha I - T$ is symmetric for any $\alpha \in \mathbb{R}$, $((\alpha I - T)x,i\beta x)$ is pure imaginary for $\alpha,\beta \in \mathbb{R}$. Hence, for all $x \in D(T)$ and $\lambda = \alpha + i\beta$,

$$\|(\lambda I - T)x\|^2 = \|(\alpha I - T)x + i\beta x\|^2$$
$$= \|(\alpha I - T)x\|^2 + 2\Re((\alpha I - T)x,i\beta x) + \beta^2\|x\|^2$$
$$= \|(\alpha I - T)x\|^2 + \beta^2\|x\|^2 \ge \beta^2\|x\|^2.$$

Hence

$$\|(\lambda I - T)x\| \ge |\Im\lambda|\,\|x\|. \qquad (13)$$

If $\lambda$ is non-real, it follows from (13) that $\lambda I - T$ is one-to-one, and (11) holds.
If $T$ is also closed, $(\lambda I - T)^{-1}$ is closed and continuous on its domain $R(\lambda I - T)$ (by (11)), and therefore this domain is closed (for non-real $\lambda$).
If $T$ is selfadjoint,

$$R(\lambda I - T)^\perp = \ker((\lambda I - T)^*) = \ker(\bar\lambda I - T) = \{0\}$$

since $\bar\lambda$ is non-real. Therefore, $(\lambda I - T)^{-1}$ is everywhere defined, with operator norm $\le 1/|\Im\lambda|$, by (11). This shows that every non-real $\lambda$ is in $\rho(T)$, that is, $\sigma(T) \subset \mathbb{R}$.
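
Inequality (13) is easy to test numerically for a Hermitian matrix, which is the simplest instance of a symmetric operator. The sketch below is an illustration only: the matrix `T`, the point `lam`, the test vectors, and the helper names are our own choices.

```python
import math

# A Hermitian 2x2 matrix: real diagonal, conjugate off-diagonal entries.
T = [[2.0, 1 - 1j], [1 + 1j, -1.0]]


def apply(M, x):
    """Matrix-vector product for a 2x2 matrix and a vector in C^2."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]


def norm(x):
    """Euclidean norm on C^2."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))


def lower_bound_gap(lam, x):
    """||(lam*I - T)x|| - |Im lam| * ||x||, which (13) says is >= 0."""
    Tx = apply(T, x)
    y = [lam * x[0] - Tx[0], lam * x[1] - Tx[1]]
    return norm(y) - abs(lam.imag) * norm(x)


lam = 0.7 + 0.25j
vectors = [[1, 0], [0, 1], [1, 1j], [2 - 1j, -3 + 0.5j]]
gaps = [lower_bound_gap(lam, x) for x in vectors]
print(gaps)
```

Every printed gap should be non-negative; equality in (13) would require $(\alpha I - T)x = 0$, that is, $\Re\lambda$ an eigenvalue of $T$ with eigenvector $x$.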


We end this section with a few more properties of the adjoint.

Suppose that, in addition to being densely defined, $T$ is closable. Then $\overline{T}^* = T^*$, because since $Q$ is an isometry we have $Q\Gamma(\overline{T}) = Q\overline{\Gamma(T)} = \overline{Q\Gamma(T)}$, so (4) implies that $\Gamma(\overline{T}^*) = (Q\Gamma(\overline{T}))^\perp = (Q\Gamma(T))^\perp = \Gamma(T^*)$. Furthermore, $T^*$ is now also densely defined, because if $w \in D(T^*)^\perp$ then $[w,0] \in \Gamma(T^*)^\perp = (Q\Gamma(T))^{\perp\perp} = \overline{Q\Gamma(T)} = Q\Gamma(\overline{T})$ by the orthogonal decomposition theorem, which by the definition of $Q$ means that $[0,w] \in \Gamma(\overline{T})$; since $\Gamma(\overline{T})$ is the graph of an operator, this means that $w = 0$.

So if the densely defined operator $T$ is closable, its adjoint $T^*$ also has an adjoint, which we denote by $T^{**}$. Using (4) again we deduce that

$$\Gamma(T^{**}) = (Q\Gamma(T^*))^\perp = \left(Q\left((Q\Gamma(T))^\perp\right)\right)^\perp.$$

Since $Q$ is a unitary equivalence of $X \times X$ with itself, we have $Q(A^\perp) = (QA)^\perp$ for all $A \subset X \times X$. This and the fact that $Q^2 = -I_{X \times X}$ yields

$$\Gamma(T^{**}) = (Q^2\Gamma(T))^{\perp\perp} = \overline{\Gamma(T)} = \Gamma(\overline{T}),$$

proving that $T^{**} = \overline{T}$. In particular, if $T$ is closed then $T^{**} = T$.

Conversely, if a densely defined operator $T$ is such that $T^*$ is densely defined, then $T$ must be closable, because if $\{x_n\} \subset D(T)$ converges to 0 and $\{Tx_n\}$ converges to $u \in X$, then for $y \in D(T^*)$ we have $(Tx_n,y) = (x_n,T^*y)$ for each $n$. The left-hand side converges to $(u,y)$ while the right-hand side converges to 0, so the density of $D(T^*)$ implies that $u = 0$ indeed. (Alternatively, we could use the reasoning of the previous paragraph to show that $\overline{\Gamma(T)}$ is the graph of an operator, namely, $T^{**}$, thus $T$ is closable.)


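10.3 The spectral theorem for unbounded
selfadjoint operators


Theorem 10.5. Let $T$ be a selfadjoint operator on the Hilbert space $X$. Then there exists a unique regular selfadjoint spectral measure $E$ on $\mathcal{B} := \mathcal{B}(\mathbb{C})$, supported by $\sigma(T) \subset \mathbb{R}$, such that

$$D(T) = \left\{x \in X; \int_{\sigma(T)} \lambda^2\,d\|E(\lambda)x\|^2 < \infty\right\} = \left\{x \in X; \lim_{n\to\infty}\int_{-n}^{n} \lambda\,dE(\lambda)x \text{ exists}\right\} \qquad (1)$$

and

$$Tx = \lim_{n\to\infty}\int_{-n}^{n} \lambda\,dE(\lambda)x \qquad (x \in D(T)). \qquad (2)$$

(The limits shown here are strong limits in $X$.)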

Proof. By Theorem 10.4, every non-real a (to be fixed from now on) is in p(T), 
and Ra := R(a;T) is a bounded normal operator. Let F' be its resolution of the 
identity, and define 

E(d) = F(h(4)) (6 € B), (3) 


10.3. The spectral theorem for unbounded selfadjoint operators 297 


where h is as in Theorem 10.2. 
By Theorem 9.8 (Part 5), F({0})X = ker R, = {0}, and therefore 


EC) = #0} =f =P 0}) = 4. (4) 


We conclude that FE is a selfadjoint regular spectral measure from the 
corresponding properties of F’. 
By Theorem 10.2, 


E(o(T)) = F(R(o(T))) = F(o(Ra)) — FU0}) =F, 


hence E is supported by o(T) (by (4)). 
Denote the sets in (1) by Do and Dj. 
If 6 € B is bounded, then for all x € X, 


|  d|JEQ)E(6)2l|? = a] Y dl|EO)e|2 < co, (5) 
o(T) 6No(T) 


since \? is bounded on 6Noa(T). Hence E(5)X C Do. Moreover, by Theorem 9.6, 
the last integral in (5) equals || Jinecr) AdE(A)zx||?. For positive integers n > m, 
take 6 = [—n, —m] U [m,n]. Then 


2 


| @ \dE(A)a — ds \dE(A)x 


=f XalEQ@alP- fe alEOal?. 
It follows that Dp = Dj. 
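Let $x \in D(T)$. We may then write $x = R_\alpha y$ for a unique $y \in X$, and therefore

$$\int_{-n}^{n} \lambda\, dE(\lambda)x = \int_{-n}^{n} \lambda\, dE(\lambda)\, R_\alpha y
  = \int_{-n}^{n} \lambda\, dE(\lambda) \int_{\mathbb{R}} h(\lambda)\, dE(\lambda)y
  = \int_{-n}^{n} \lambda h(\lambda)\, dE(\lambda)y \to \int_{\mathbb{R}} \lambda h(\lambda)\, dE(\lambda)y.$$

(The limit exists in $X$ because $\lambda h(\lambda)$ is bounded.) Thus, $x \in D_0$, and we proved
that $D(T) \subset D_0$.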
Next, let $x \in D_0\,(= D_1)$, and denote $z = \lim_n \int_{-n}^{n} \lambda\, dE(\lambda)x$. Consider the
sequence $x_n := E([-n,n])x$. Then $x_n \to x$ in $X$,

$$x_n = R_\alpha \int_{-n}^{n} (\alpha - \lambda)\, dE(\lambda)x \in R_\alpha X = D(T), \tag{6}$$

and by (6)

$$(\alpha I - T)x_n = \int_{-n}^{n} (\alpha - \lambda)\, dE(\lambda)x = \alpha x_n - \int_{-n}^{n} \lambda\, dE(\lambda)x \to \alpha x - z.$$

Since $\alpha I - T$ (with domain $D(T)$) is closed, it follows that $x \in D(T)$ and
$(\alpha I - T)x = \alpha x - z$. Hence $D_0 \subset D(T)$ (and so $D(T) = D_0$), and (2) is valid.
For each bounded $\delta \in \mathcal{B}$, the restriction of $T$ to the reducing subspace $E(\delta)X$
is the bounded selfadjoint operator $\int_\delta \lambda\, dE(\lambda)$. By the uniqueness of the resolution
of the identity for bounded selfadjoint operators, $E$ is uniquely determined on the
bounded Borel sets, and therefore on all Borel sets, by Theorem 9.6 (Part 2).


10.4 The operational calculus for unbounded 
selfadjoint operators 


The unique spectral measure E of Theorem 10.5 is called the resolution of the 
identity for T. 

The map $f \mapsto f(T) := \int_{\mathbb{R}} f\, dE$ of $\mathbb{B} := \mathbb{B}(\mathbb{R})$ into $B(X)$ is a norm-decreasing
*-representation of $\mathbb{B}$ on $X$ (cf. Theorem 9.1). The map is extended to arbitrary
complex Borel functions $f$ on $\mathbb{R}$ as follows. Let $\chi_n$ be the indicator of the set
$[|f| \le n]$, and consider the "truncations" $f_n := f\chi_n \in \mathbb{B}$, $n \in \mathbb{N}$. The operator
$f(T)$ has domain $D(f(T))$ equal to the set of all $x \in X$ for which the strong
limit $\lim_n f_n(T)x$ exists, and $f(T)x$ is defined as this limit for $x \in D(f(T))$.
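
As a quick illustration of this definition (a sketch that uses only Theorem 10.5, with the truncation sets taken as $[|f| \le n]$), consider $f(\lambda) = \lambda$. The truncations are $f_n(\lambda) = \lambda\chi_{[-n,n]}(\lambda)$, and the definition returns $T$ itself:

```latex
f_n(T)x = \int_{\mathbb{R}} \lambda\,\chi_{[-n,n]}(\lambda)\,dE(\lambda)x
        = \int_{-n}^{n} \lambda\,dE(\lambda)x ,
\qquad
D(f(T)) = \Bigl\{x \in X;\ \lim_{n\to\infty}\int_{-n}^{n}\lambda\,dE(\lambda)x
          \text{ exists}\Bigr\} = D(T),
\qquad
f(T)x = \lim_{n\to\infty}\int_{-n}^{n}\lambda\,dE(\lambda)x = Tx
\quad (x \in D(T)).
```

So the extended calculus is consistent with relations (1) and (2) of Theorem 10.5.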

Note that if $f$ is bounded, then $f_n = f$ for all $n \ge \sup|f|$, and therefore the
new definition of $f(T)$ coincides with the previous one for $f \in \mathbb{B}$. In particular,
$f(T) \in B(X)$. For general Borel functions, we have the following.


Theorem 10.6. Let $T$ be an unbounded selfadjoint operator on the Hilbert
space $X$, and let $E$ be its resolution of the identity. For $f : \mathbb{R} \to \mathbb{C}$ Borel, let
$f(T)$ be defined as shown. Then:

(a) $D(f(T)) = \{x \in X;\ \int_{\mathbb{R}} |f|^2\, d\|E(\cdot)x\|^2 < \infty\}$;
(b) $f(T)$ is a closed densely defined operator; and
(c) $f(T)^* = \bar{f}(T)$.


Proof. (a) Let $D$ denote the set on the right-hand side of (a).
Since $f_n(s) = f(s)$ for all $n \ge |f(s)|$, $f_n \to f$ pointwise. If $x \in D$,

$$|f_n - f_m|^2 \le 4|f|^2 \in L^1(\|E(\cdot)x\|^2)$$

and $|f_n - f_m|^2 \to 0$ pointwise when $n,m \to \infty$. Therefore, by Theorem 9.6 and
Lebesgue's dominated convergence theorem,

$$\|f_n(T)x - f_m(T)x\|^2 = \|(f_n - f_m)(T)x\|^2 = \int_{\mathbb{R}} |f_n - f_m|^2\, d\|E(\cdot)x\|^2 \to 0$$

as $n,m \to \infty$. Hence $x \in D(f(T))$. On the other hand, if $x \in D(f(T))$, we have
by Fatou's lemma

$$\int_{\mathbb{R}} |f|^2\, d\|E(\cdot)x\|^2 \le \liminf_n \int_{\mathbb{R}} |f_n|^2\, d\|E(\cdot)x\|^2
  = \liminf_n \|f_n(T)x\|^2 = \|f(T)x\|^2 < \infty,$$

that is, $x \in D$, and (a) has been verified.
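(b) Let $x \in X$, and $\delta_n = [|f| \le n]$. Clearly $\delta_{n+1}^c \subset \delta_n^c$ and $\bigcap_n \delta_n^c = \emptyset$. Since
$\|E(\cdot)x\|^2$ is a finite positive measure,

$$\lim_n \|E(\delta_n^c)x\|^2 = \Bigl\|E\Bigl(\bigcap_n \delta_n^c\Bigr)x\Bigr\|^2 = 0,$$

that is,

$$\lim_n \|x - E(\delta_n)x\| = 0 \qquad (x \in X). \tag{1}$$

Now

$$\int_{\mathbb{R}} |f|^2\, d\|E(\cdot)E(\delta_n)x\|^2 = \int_{\delta_n} |f|^2\, d\|E(\cdot)x\|^2 \le n^2\|x\|^2 < \infty,$$

that is, $E(\delta_n)x \in D(f(T))$, by Part (a). This proves that $D(f(T))$ is dense in $X$.

Fix $x \in D(f(T))$ and $m \in \mathbb{N}$. Since $E(\delta_m)$ is a bounded operator, we have by
the operational calculus for bounded Borel functions and the relation $\chi_{\delta_m} f_n = f_m$
for all $n \ge m$,

$$E(\delta_m)f(T)x = \lim_n E(\delta_m)f_n(T)x = \lim_n f_m(T)x = f_m(T)x. \tag{2}$$

Similarly

$$f(T)E(\delta_m)x = f_m(T)x \qquad (x \in X). \tag{3}$$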


In order to show that $f(T)$ is closed, let $\{x_n\}$ be any sequence in $D(f(T))$ such
that $x_n \to x$ and $f(T)x_n \to y$. By (2) applied to $x_n \in D(f(T))$,

$$E(\delta_m)y = \lim_n E(\delta_m)f(T)x_n = \lim_n f_m(T)x_n = f_m(T)x,$$

since $f_m(T) \in B(X)$. Letting $m \to \infty$, we see that $f_m(T)x \to y$ (by (1)). Hence
$x \in D(f(T))$, and $f(T)x := \lim_m f_m(T)x = y$. This proves (b).

(c) By the operational calculus for bounded Borel functions, $(f_n(T)x, y) =
(x, \bar{f}_n(T)y)$ for all $x,y \in X$. When $x,y \in D(f(T)) = D(\bar{f}(T))$ (cf. (a)), letting
$n \to \infty$ implies the relation $(f(T)x, y) = (x, \bar{f}(T)y)$. Hence $\bar{f}(T) \subset f(T)^*$. On
the other hand, if $y \in D(f(T)^*)$, we have by (3) (for all $x \in X$)

$$(x, \bar{f}_m(T)y) = (f_m(T)x, y) = (f(T)E(\delta_m)x, y) = (x, E(\delta_m)f(T)^*y),$$

that is,

$$\bar{f}_m(T)y = E(\delta_m)f(T)^*y.$$

The right-hand side converges to $f(T)^*y$ when $m \to \infty$. Hence, $y \in D(\bar{f}(T))$,
and (c) follows.




10.5 The semi-simplicity space for 
unbounded operators in Banach space 


Let $T$ be an unbounded operator with real spectrum on the Banach space $X$. Its
Cayley transform

$$V := (iI - T)(iI + T)^{-1} = -2iR(-i;T) - I$$

belongs to $B(X)$.
By Theorem 10.2 with $\alpha = -i$ and the corresponding $h$,

$$\sigma(R(-i;T)) = h(\sigma(T) \cup \{\infty\}),$$

where $\infty$ denotes the point at infinity of the Riemann sphere. Therefore,

$$\sigma(V) = -2ih(\sigma(T) \cup \{\infty\}) - 1 \subset -2ih(\mathbb{R} \cup \{\infty\}) - 1
  = \Bigl\{\frac{i-\lambda}{i+\lambda};\ \lambda \in \mathbb{R}\Bigr\} \cup \{-1\} = \Gamma,$$

where $\Gamma$ denotes the unit circle.
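
The inclusion of $\sigma(V)$ in the unit circle can be checked by a direct computation. The sketch below assumes that the function of Theorem 10.2 is $h(\lambda) = 1/(\alpha - \lambda)$ with $h(\infty) = 0$ (that theorem is stated outside this section), taken here with $\alpha = -i$:

```latex
h(\lambda) = \frac{1}{-i-\lambda}, \qquad
-2i\,h(\lambda) - 1 = \frac{2i}{i+\lambda} - 1 = \frac{i-\lambda}{i+\lambda},
\qquad
\left|\frac{i-\lambda}{i+\lambda}\right|^{2}
  = \frac{1+\lambda^{2}}{1+\lambda^{2}} = 1 \quad (\lambda \in \mathbb{R}),
\qquad
-2i\,h(\infty) - 1 = -1 .
```

Real points are thus sent to points of modulus one, and the point at infinity is sent to $-1$.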


Definition 10.7. Let $T$ be an unbounded operator with real spectrum, and
let $V$ be its Cayley transform. The semi-simplicity space for the unbounded
operator $T$ is defined as the semi-simplicity space $Z$ for the bounded operator $V$
with spectrum in $\Gamma$ (cf. Remark 9.13, (4)).

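The function

$$\phi(s) := \frac{i-s}{i+s}$$

is a homeomorphism of $\bar{\mathbb{R}} := \mathbb{R} \cup \{\infty\}$ onto $\Gamma$, with the inverse $\phi^{-1}(\lambda) =
i(1-\lambda)/(1+\lambda)$.

For any $g \in C(\bar{\mathbb{R}})$, we have $g \circ \phi^{-1} \in C(\Gamma)$, and therefore, by Theorem 9.14,
the operator $(g \circ \phi^{-1})(V|_Z)$ belongs to $B(Z)$, with $B(Z)$-norm $\le \|g \circ \phi^{-1}\|_{C(\Gamma)} =
\|g\|_{C(\bar{\mathbb{R}})}$.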

The restriction $V|_Z$ is the Cayley transform of $T_Z$, which is the restriction of
$T$ to the domain

$$D(T_Z) := \{x \in D(T) \cap Z;\ Tx \in Z\}.$$

The operator $T_Z$ is called the part of $T$ in $Z$.
It is therefore natural to define

$$g(T_Z) := (g \circ \phi^{-1})(V|_Z) \qquad (g \in C(\bar{\mathbb{R}})). \tag{1}$$

The map $\tau : g \mapsto g(T_Z)$ is a norm-decreasing algebra homomorphism of $C(\bar{\mathbb{R}})$
into $B(Z)$ such that $f_0(T_Z) = I|_Z$ and $\phi(T_Z) = V|_Z$. We call a map $\tau$ with the
shown properties a contractive $C(\bar{\mathbb{R}})$-operational calculus for $T$ on $Z$; when such
$\tau$ exists, we say that $T$ is of contractive class $C(\bar{\mathbb{R}})$ on $Z$.




If $W$ is a Banach subspace of $X$ such that $T_W$ is of contractive class $C(\bar{\mathbb{R}})$
on $W$, then the map

$$f \in C(\Gamma) \mapsto f(V|_W) := (f \circ \phi)(T_W) \in B(W)$$

is a contractive $C(\Gamma)$-operational calculus for $V|_W$ in $B(W)$ (note that $(f_1 \circ
\phi)(T_W) = \phi(T_W) = V|_W$). Therefore, $W$ is a Banach subspace of $Z$, by
Theorem 9.14. We formalize these observations as:


Theorem 10.8. Let $T$ be an unbounded operator with real spectrum, and let $Z$
be its semi-simplicity space. Then $T$ is of contractive class $C(\bar{\mathbb{R}})$ on $Z$, and $Z$ is
maximal with this property (in the sense detailed in Theorem 9.14).


For X reflexive, we obtain a spectral integral representation for Tz. 


Theorem 10.9. Let $T$ be an unbounded operator with real spectrum on the
reflexive Banach space $X$, and let $Z$ be its semi-simplicity space. Then there
exists a contractive spectral measure on $Z$

$$F : \mathcal{B}(\mathbb{R}) \to B(Z),$$

such that

(1) $F$ commutes with every $U \in B(X)$ which commutes with $T$;
(2) $D(T_Z)$ is the set $Z_1$ of all $x \in Z$ such that the integral

$$\int_{\mathbb{R}} s\, dF(s)x := \lim_{a \to -\infty,\, b \to \infty} \int_a^b s\, dF(s)x$$

exists in $X$ and belongs to $Z$;
(3) $Tx = \int_{\mathbb{R}} s\, dF(s)x$ for all $x \in D(T_Z)$; and
(4) For all non-real $\lambda \in \mathbb{C}$ and $x \in Z$,

$$R(\lambda;T)x = \int_{\mathbb{R}} \frac{1}{\lambda - s}\, dF(s)x.$$

Proof. We apply Theorem 9.16 to the Cayley transform $V$. Let then $E$ be the
unique contractive spectral measure on $Z$, with support on the unit circle $\Gamma$,
such that

$$f(V|_Z)x = \int_\Gamma f\, dE(\cdot)x \tag{2}$$

for all $x \in Z$ and $f \in C(\Gamma)$.

If $E(\{-1\}) \ne 0$, each $x \ne 0$ in $E(\{-1\})Z$ is an eigenvector for $V$,
corresponding to the eigenvalue $-1$ (the argument is the same as in the proof of
Theorem 9.8, Part 5, first paragraph). However, since $V = -2iR(-i;T) - I$, we
have the relation

$$R(-i;T) = (i/2)(I + V), \tag{3}$$




from which it is evident that —1 is not an eigenvalue of V (since R(—i;T) is 
one-to-one). Thus 
E({—1}) = 0. (4) 


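Define

$$F(\delta) = E(\phi(\delta)) \qquad (\delta \in \mathcal{B}(\mathbb{R})).$$

Then $F$ is a contractive spectral measure on $Z$ defined on $\mathcal{B}(\mathbb{R})$ (note that the
requirement $F(\mathbb{R}) = I|_Z$ follows from (4):

$$F(\mathbb{R}) = E(\Gamma \setminus \{-1\}) = E(\Gamma) = I|_Z).$$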


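If $U \in B(X)$ commutes with $T$, it follows that $U$ commutes with $V =
-2iR(-i;T) - I$, and therefore $U$ commutes with $E$, hence with $F$. By (2)

$$f(V|_Z)x = \int_{\mathbb{R}} f \circ \phi\, dF(\cdot)x \tag{5}$$

for all $x \in Z$ and $f \in C(\Gamma)$. By definition, the left-hand side of (5) is $(f \circ \phi)(T_Z)x$
for $f \in C(\Gamma)$. We may then rewrite (5) in the form

$$g(T_Z)x = \int_{\mathbb{R}} g\, dF(\cdot)x \qquad (x \in Z) \tag{6}$$

for all $g \in C(\bar{\mathbb{R}})$. Taking in particular $g = \phi$, we get (since $\phi(T_Z) = V|_Z$)

$$Vx = \int_{\mathbb{R}} \phi\, dF(\cdot)x \qquad (x \in Z). \tag{7}$$

By (3) and (7), we have for all $x \in Z$

$$R(-i;T)x = (i/2) \int_{\mathbb{R}} (1 + \phi(s))\, dF(s)x = \int_{\mathbb{R}} \frac{1}{-i-s}\, dF(s)x. \tag{8}$$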
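
The equality of the two integrands in (8) is elementary algebra. As a check, with $\phi(s) = (i-s)/(i+s)$ as defined earlier in this section:

```latex
\frac{i}{2}\bigl(1+\phi(s)\bigr)
  = \frac{i}{2}\cdot\frac{(i+s)+(i-s)}{i+s}
  = \frac{i}{2}\cdot\frac{2i}{i+s}
  = \frac{-1}{i+s}
  = \frac{1}{-i-s}
  \qquad (s \in \mathbb{R}).
```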
Observe that

$$D(T_Z) = R(-i;T)Z. \tag{9}$$

Indeed, if $x \in D(T_Z)$, then $x \in D(T) \cap Z$ and $Tx \in Z$, by definition. Therefore,
$z := (-iI - T)x \in Z$, and $x = R(-i;T)z \in R(-i;T)Z$. On the other hand, if
$x = R(-i;T)z$ for some $z \in Z$, then $x \in D(T) \cap Z$ (because $Z$ is invariant for
$R(-i;T)$), and $Tx = -ix - z \in Z$, so that $x \in D(T_Z)$.

Now let $x \in D(T_Z)$, and write $x = R(-i;T)z$ for a suitable $z \in Z$ (by (9)).
The spectral integral on the right-hand side of (6) defines a norm-decreasing
algebra homomorphism $\tilde\tau$ of $\mathbb{B}(\mathbb{R})$ into $B(Z)$, which extends the $C(\bar{\mathbb{R}})$-operational
calculus for $T$ on $Z$ (cf. Theorem 9.16). For real $a < b$, take $g(s) = s\chi_{[a,b]}(s) \in
\mathbb{B}(\mathbb{R})$. By (8)

$$\int_a^b s\, dF(s)x = \int_a^b s\, dF(s)\, R(-i;T)z = \int_a^b \frac{s}{-i-s}\, dF(s)z \to \int_{\mathbb{R}} \frac{s}{-i-s}\, dF(s)z$$

as $a \to -\infty$ and $b \to \infty$ (convergence in $X$ of the last integral follows from the
boundedness of the integrand on $\mathbb{R}$). Thus, the integral $\int_{\mathbb{R}} s\, dF(s)x$ exists in $X$
(in the sense stated in the theorem). Writing $s/(-i-s) = [-i/(-i-s)] - 1$, the
last relation and (8) show that

$$\int_{\mathbb{R}} s\, dF(s)x = -iR(-i;T)z - z = TR(-i;T)z = Tx \in Z. \tag{10}$$

This proves that $D(T_Z) \subset Z_1$ and Statement 3 of the theorem is valid.

On the other hand, if $x \in Z_1$, consider the well-defined element of $Z$ given
by $z := \int_{\mathbb{R}} s\, dF(s)x$. Since $R(-i;T) \in B(X)$ commutes with $T$ (hence with $F$)
and $x \in Z$, we have by (8) and the multiplicativity of $\tilde\tau$ on $\mathbb{B}(\mathbb{R})$

$$R(-i;T)z = \lim_{a \to -\infty,\, b \to \infty} \int_a^b s\, R(-i;T)\, dF(s)x = \lim_{a,b} \int_a^b s\, dF(s)\, R(-i;T)x$$
$$= \lim_{a,b} \int_a^b \frac{s}{-i-s}\, dF(s)x = \int_{\mathbb{R}} \frac{s}{-i-s}\, dF(s)x$$
$$= \int_{\mathbb{R}} \Bigl(\frac{-i}{-i-s} - 1\Bigr)\, dF(s)x = -iR(-i;T)x - x.$$

Hence, $x = -R(-i;T)(ix + z) \in R(-i;T)Z = D(T_Z)$, and we proved that
$D(T_Z) = Z_1$.

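For any non-real $\lambda \in \mathbb{C}$, the function $g_\lambda(s) := (\lambda - s)^{-1}$ belongs to $C(\bar{\mathbb{R}})$, so
that $g_\lambda(T_Z)$ is a well-defined operator in $B(Z)$ and by (6)

$$g_\lambda(T_Z)x = \int_{\mathbb{R}} \frac{1}{\lambda - s}\, dF(s)x \qquad (x \in Z). \tag{11}$$

Fix $x \in Z$, and let $y := g_\lambda(T_Z)x$ ($\in Z$). By the multiplicativity of $\tilde\tau : \mathbb{B}(\mathbb{R}) \to
B(Z)$ and (11),

$$\int_a^b s\, dF(s)y = \int_a^b \frac{s}{\lambda - s}\, dF(s)x \to \int_{\mathbb{R}} \frac{s}{\lambda - s}\, dF(s)x
  = \int_{\mathbb{R}} \Bigl(\frac{\lambda}{\lambda - s} - 1\Bigr)\, dF(s)x = \lambda y - x \in Z.$$

(The limit is the $X$-limit as $a \to -\infty$ and $b \to \infty$, and it exists because $s/(\lambda - s)$
is a bounded continuous function on $\mathbb{R}$.) Thus, $y \in D(T_Z)$ and $Ty = \lambda y - x$ (by
Statements 2 and 3 of the theorem). Hence, $(\lambda I - T)y = x$, and since $\lambda \in \rho(T)$,
it follows that $y = R(\lambda;T)x$, and Statement 4 is verified.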


10.6 Symmetric operators in Hilbert space 


In this section, $T$ will be an unbounded densely defined operator on a given
Hilbert space $X$. The adjoint operator $T^*$ is then a well-defined closed operator,
to which we associate the Hilbert space $[D(T^*)]$ with the $T^*$-graph norm $\|\cdot\|^*$
and the inner product

$$(x,y)^* := (x,y) + (T^*x, T^*y) \qquad (x,y \in D(T^*)).$$

We also consider the continuous sesquilinear form on $[D(T^*)]$

$$\phi(x,y) := i[(x, T^*y) - (T^*x, y)] \qquad (x,y \in D(T^*)).$$

Recall that $T$ is symmetric iff $T \subset T^*$. In particular, a symmetric operator $T$ is
closable (since it has the closed extension $T^*$). If $S$ is a symmetric extension of
$T$, then $T \subset S \subset S^* \subset T^*$, so that $S = T^*|_D$, where $D = D(S)$, and $D(T) \subset
D \subset D(T^*)$. Clearly $\phi(x,y) = 0$ for all $x,y \in D$. (Call such a subspace $D$ of
$[D(T^*)]$ a symmetric subspace.) By the polarization formula for the sesquilinear
form $\phi$, $D$ is symmetric iff $\phi(x,x)$ $(= 2\Im(T^*x,x))$ $= 0$ on $D$, that is, iff $(T^*x, x)$
is real on $D$. Since $T^* \in B([D(T^*)], X)$, the $[D(T^*)]$-closure $\bar{D}$ of a symmetric
subspace $D$ is symmetric.

If $D$ is a symmetric subspace such that $D(T) \subset D \subset D(T^*)$, then $D$ is the
domain of the symmetric extension $S := T^*|_D$ of $T$. Together with the previous
remarks, this shows that the symmetric extensions $S$ of $T$ are precisely the
restrictions of $T^*$ to symmetric subspaces of $[D(T^*)]$ that contain $D(T)$.

We verify easily that $S$ is closed iff $D$ is a closed (symmetric) subspace of
$[D(T^*)]$. (Suppose $S$ is closed and $x_n \in D \to x$ in $[D(T^*)]$, i.e., $x_n \to x$ and
$Sx_n (= T^*x_n) \to T^*x$ in $X$. Since $S$ is closed, it follows that $x \in D$, and so $D$ is
closed in $[D(T^*)]$. Conversely, if $D$ is closed in $[D(T^*)]$, $x_n \in D$, $x_n \to x$ and $Sx_n \to y$
in $X$, then $T^*x_n \to y$, and since $T^*$ is closed, $y = T^*x$, i.e., $x_n \to x$ in $[D(T^*)]$.
Hence $x \in D$, and $Sx = T^*x = y$, i.e., $S$ is closed.)

Let $S$ be a symmetric extension of the symmetric operator $T$. Since $D(S)$ is
then a symmetric subspace of $[D(T^*)]$, so is its $[D(T^*)]$-closure $\overline{D(S)}$; therefore,
the restriction of $T^*$ to $\overline{D(S)}$ is a closed symmetric extension of $S$, which is
precisely the closure $\bar{S}$ of the closable operator $S$. (If $x \in D(\bar{S})$, there exist $x_n \in
D(S) \subset D(T^*)$ such that $x_n \to x$ and $Sx_n \to \bar{S}x$. Since $S \subset T^*$, we have $x_n \to x$
in $[D(T^*)]$, hence $x \in \overline{D(S)}$. Conversely, if $x \in \overline{D(S)}$, there exist $x_n \in D(S)$
such that $x_n \to x$ in $[D(T^*)]$, that is, $x_n \to x$ and $Sx_n (= T^*x_n) \to T^*x$, hence
$x \in D(\bar{S})$. This shows that $D(\bar{S}) = \overline{D(S)}$, and $\bar{S}$ is the restriction of $T^*$ to this
domain.)

Clearly, $\bar{S}$ is the minimal closed symmetric extension of $S$, and $S$ is closed iff
$S = \bar{S}$.

Note that $T$ and $\bar{T}$ have equal adjoints, since

$$\Gamma(\bar{T}^{\,*}) = (Q\Gamma(\bar{T}))^\perp = (Q\overline{\Gamma(T)})^\perp = (\overline{Q\Gamma(T)})^\perp = (Q\Gamma(T))^\perp = \Gamma(T^*).$$

(The $\perp$ signs and the closure signs in the third and fourth expressions refer to
the Hilbert space $X \times X$.)

Therefore, $T$ and $\bar{T}$ have the same family of closed symmetric extensions
(namely, the restrictions of $T^*$ to closed symmetric subspaces of $[D(T^*)]$ that contain $D(\bar{T})$).

We are interested in the family of selfadjoint extensions of T, which is 
contained in the family of closed symmetric extensions of T. We may then assume 
without loss of generality that T is a closed symmetric operator. 




By the orthogonal decomposition theorem for the Hilbert space $[D(T^*)]$,

$$[D(T^*)] = D(T) \oplus D(T)^\perp. \tag{1}$$


Definition 10.10. Let $T$ be a closed densely defined symmetric operator. The
kernels

$$D^+ := \ker(I + iT^*); \qquad D^- := \ker(I - iT^*)$$

are called the positive and negative deficiency spaces of $T$ (respectively). Their
(Hilbert) dimensions $n^+$ and $n^-$ (in the Hilbert space $[D(T^*)]$) are called the
deficiency indices of $T$.

Note that

$$D^+ = \{y \in D(T^*);\ T^*y = iy\}; \qquad D^- = \{y \in D(T^*);\ T^*y = -iy\}. \tag{2}$$

In particular, $(x,y)^* = 2(x,y)$ on $D^+$ and on $D^-$, so that the Hilbert dimensions
$n^+$ and $n^-$ may be taken with respect to $X$ (the deficiency spaces are also closed
in $X$, as can be seen from their definition and the fact that $T^*$ is a closed
operator).

We have $D^+ \perp D^-$, because if $x \in D^+$ and $y \in D^-$, then

$$(x,y)^* = (x,y) + (T^*x, T^*y) = (x,y) + (ix, -iy) = 0.$$

If $y \in D^+$, then for all $x \in D(T)$,

$$(x,y)^* = (x,y) + (T^*x, T^*y) = (x,y) + (Tx, iy) = (x,y) + (x, iT^*y) = (x,y) - (x,y) = 0,$$

and similarly for $y \in D^-$. Hence

$$D^+ \oplus D^- \subset D(T)^\perp. \tag{3}$$

On the other hand, if $y \in D(T)^\perp$, we have

$$0 = (x,y)^* = (x,y) + (Tx, T^*y) \qquad (x \in D(T)),$$

hence, $(Tx, T^*y) = -(x,y)$ is a continuous function of $x$ on $D(T)$, that is,
$T^*y \in D(T^*)$ and $T^*(T^*y) = -y$. It follows that

$$(I - iT^*)(I + iT^*)y = (I + iT^*)(I - iT^*)y = 0. \tag{4}$$

Therefore

$$y - iT^*y \in \ker(I + iT^*) = D^+; \qquad y + iT^*y \in \ker(I - iT^*) = D^-.$$

Consequently

$$y = (1/2)(y - iT^*y) + (1/2)(y + iT^*y) \in D^+ \oplus D^-.$$




This shows that $D(T)^\perp \subset D^+ \oplus D^-$, and we conclude from (3) and (1) that

$$D(T)^\perp = D^+ \oplus D^-, \tag{5}$$

and

$$[D(T^*)] = D(T) \oplus D^+ \oplus D^-. \tag{6}$$

It follows trivially from (6) that $T$ is selfadjoint iff $n^+ = n^- = 0$.

Let $D$ be a closed symmetric subspace of $[D(T^*)]$ containing $D(T)$. By the
orthogonal decomposition theorem for the Hilbert space $D$ with respect to its
closed subspace $D(T)$, $D = D(T) \oplus W$, where $W = D \ominus D(T) := D \cap D(T)^\perp$
is a closed symmetric subspace of $D(T)^\perp$. Conversely, given such a subspace $W$,
the subspace $D := D(T) \oplus W$ is a closed symmetric subspace of $[D(T^*)]$. By (5),
the problem of finding all the closed symmetric extensions $S$ of $T$ is now reduced
to the problem of finding all the closed symmetric subspaces $W$ of $D^+ \oplus D^-$.
Let $x_k$, $k = 1,2$, be the components of $x \in W$ in $D^+$ and $D^-$ ($x = x_1 + x_2$
corresponds as usual to the element $[x_1, x_2] \in D^+ \times D^-$). The symmetry of $D$
means that $(T^*x, x)$ is real on $W$. However

$$(T^*x, x) = (T^*x_1 + T^*x_2, x_1 + x_2) = i(x_1 - x_2, x_1 + x_2)
  = i(\|x_1\|^2 - \|x_2\|^2) - 2\Im(x_1, x_2)$$

is real iff $\|x_1\| = \|x_2\|$. Thus, $(T^*x, x)$ is real on $W$ iff the map $U : x_1 \mapsto x_2$ is
a (linear) isometry of a (closed) subspace $D(U)$ of $D^+$ onto a (closed) subspace
$R(U)$ of $D^-$. Thus, $W$ is a closed symmetric subspace of $D(T)^\perp$ iff

$$W = \{[x_1, Ux_1];\ x_1 \in D(U)\}$$

is the graph of a linear isometry $U$ as shown. (Note that since $\|x\|^* = \sqrt{2}\,\|x\|$ on
$D^+$ and $D^-$, $U$ is an isometry in both Hilbert spaces $X$ and $[D(T^*)]$.)

Suppose $D(U)$ is a proper (closed) subspace of $D^+$. Let then $0 \ne y \in D^+ \cap
D(U)^\perp$. Necessarily, $y \in D(S)^\perp$, so that for all $x \in D(S)$

$$0 = (x,y)^* = (x,y) + (Sx, T^*y) = (x,y) - i(Sx, y).$$

Hence, $(Sx, y) = -i(x,y)$ is a continuous function of $x$ on $D(S)$, that is, $y \in
D(S^*)$. Since $0 \ne y \in D(S)^\perp$, this shows that $S \ne S^*$. The same conclusion is
obtained if $R(U)$ is a proper subspace of $D^-$ (same argument!). In other words,
a necessary condition for $S$ to be selfadjoint is that $U$ be an isometry of $D^+$
onto $D^-$. Thus, if $T$ has a selfadjoint extension, there exists a (linear) isometry
of $D^+$ onto $D^-$ (equivalently, $n^+ = n^-$).

On the other hand, if there exists a (linear) isometry $U$ of $D^+$ onto $D^-$,
define $S$ as the restriction of $T^*$ to $D(S) := D(T) \oplus \Gamma(U)$. Since this domain
$D(S)$ is a closed symmetric subspace of $[D(T^*)]$ (containing $D(T)$), $S$ is a closed
symmetric extension of $T$. In particular, $S \subset S^*$, and we have the decomposition
(6) for $S$

$$D(S^*) = D(S) \oplus D^+(S) \oplus D^-(S). \tag{7}$$




Since $S^* \subset T^*$, the graph inner products for $S^*$ and $T^*$ coincide on $D(S^*)$,
$D^+(S) \subset D^+$, and $D^-(S) \subset D^-$.

If $S \ne S^*$, it follows from (7) that there exists $0 \ne x \in D^+(S)$ (or $\in D^-(S)$).
Hence $x + Ux \in D(S)$ (or $U^{-1}x + x \in D(S)$, respectively). Therefore, by (7),
since $Ux \in D^-$ and $x \in D^+$ ($U^{-1}x \in D^+$ and $x \in D^-$, respectively),

$$0 = (x + Ux, x)^* = (x,x)^* = 2\|x\|^2 > 0$$

($0 = (U^{-1}x + x, x)^* = (x,x)^* = 2\|x\|^2 > 0$, respectively), contradiction. Hence
$S = S^*$.
We proved the following theorem of von Neumann. 


Theorem 10.11. Let $T$ be a closed densely defined symmetric operator on the
Hilbert space $X$. Then the closed symmetric extensions of $T$ are the restrictions
of $T^*$ to the closed subspaces of $[D(T^*)]$ of the form $D(T) \oplus \Gamma(U)$, where $U$ is
a linear isometry of a closed subspace of $D^+$ onto a closed subspace of $D^-$, and
$\Gamma(U)$ is its graph. Such a restriction is selfadjoint if and only if $U$ is an isometry
of $D^+$ onto $D^-$. In particular, $T$ has a selfadjoint extension iff $n^+ = n^-$ and has
no proper closed symmetric extensions iff at least one of its deficiency indices
vanishes.
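
To see the theorem at work, here is a standard illustration that is not taken from this section and is offered only as a sketch. We assume $X = L^2(0,\infty)$, that $T$ is the closure of $-i\,d/dx$ on the smooth functions with compact support in $(0,\infty)$, and we take for granted the usual description of $T^*$ as $-i\,d/dx$ on the absolutely continuous $y \in X$ with $y' \in X$. With the sign conventions of (2):

```latex
D^{+}:\quad T^{*}y = iy \iff -iy' = iy \iff y' = -y \iff y(x) = c\,e^{-x},
\qquad e^{-x} \in L^{2}(0,\infty) \ \Longrightarrow\ n^{+} = 1;
\\[4pt]
D^{-}:\quad T^{*}y = -iy \iff -iy' = -iy \iff y' = y \iff y(x) = c\,e^{x},
\qquad e^{x} \notin L^{2}(0,\infty) \ \Longrightarrow\ n^{-} = 0 .
```

Under these assumptions $n^+ \ne n^-$, so by Theorem 10.11 this $T$ has no selfadjoint extension; since one index vanishes, it has no proper closed symmetric extension either.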


10.7 Quadratic forms 


We start with the following well-known result from linear algebra, given here 
without proof. 


Lemma 10.12. Let $X$ be a complex vector space.

1. There exists a 1-1 correspondence between the set of functions $q(\cdot,\cdot) :
X \times X \to \mathbb{C}$ that are linear in the left variable and conjugate linear in the
right variable, and the set of functions $q(\cdot) : X \to \mathbb{C}$ satisfying

$$q(\alpha x) = |\alpha|^2 q(x), \qquad q(x+y) + q(x-y) = 2q(x) + 2q(y) \qquad (\forall x,y \in X,\ \alpha \in \mathbb{C}). \tag{1}$$

It is given by

$$q(x) = q(x,x) \qquad (\forall x \in X)$$

and

$$q(x,y) = \frac{1}{4} \sum_{k=0}^{3} i^k q(x + i^k y) \qquad (\forall x,y \in X).$$

2. Let $q(\cdot)$ and $q(\cdot,\cdot)$ be related as above. Then $q(\cdot)$ is real valued if and only if
$q(\cdot,\cdot)$ is Hermitian, namely, $q(y,x) = \overline{q(x,y)}$ for all $x,y \in X$; and $q(\cdot)$ is
non-negative valued iff $q(\cdot,\cdot)$ is positive semi-definite, namely, $q(x,x) \ge 0$
for all $x \in X$, that is: $q(\cdot,\cdot)$ is a semi-inner product on $X$.
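
Although the lemma is stated without proof, one direction of Part 1 is easy to verify by hand. The following sketch assumes only that $q(\cdot,\cdot)$ is linear in the left variable and conjugate linear in the right variable, and checks that the polarization sum $\frac14\sum_{k=0}^{3} i^k q(x+i^k y)$ recovers it from $q(x) = q(x,x)$:

```latex
q(x + i^{k}y) = q(x,x) + \overline{i^{k}}\,q(x,y) + i^{k}\,q(y,x) + q(y,y),
\\[4pt]
\frac14\sum_{k=0}^{3} i^{k}\,q(x+i^{k}y)
  = \frac14\Bigl[\Bigl(\sum_{k=0}^{3} i^{k}\Bigr)\bigl(q(x,x)+q(y,y)\bigr)
    + \Bigl(\sum_{k=0}^{3} i^{k}\,\overline{i^{k}}\Bigr) q(x,y)
    + \Bigl(\sum_{k=0}^{3} i^{2k}\Bigr) q(y,x)\Bigr]
  = q(x,y),
\\[4pt]
\text{since}\quad \sum_{k=0}^{3} i^{k} = 0,\qquad
\sum_{k=0}^{3} |i^{k}|^{2} = 4,\qquad
\sum_{k=0}^{3} (-1)^{k} = 0 .
```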




Definition 10.13. A quadratic form on a complex vector space $X$ is a function
$q : X \to \mathbb{C}$ satisfying (1). The associated function $X \times X \to \mathbb{C}$ will be denoted
by $q(\cdot,\cdot)$ as shown.


We will only treat non-negative quadratic forms, and thus omit this adjective. 


Definition 10.14. Let X be a Hilbert space. A quadratic form q on X is a 
quadratic form on a linear subspace D(q) of X. We say that q is densely defined 
if D(q) is dense in X. 


In the sequel, X is a Hilbert space. 


Example 1. A (generally unbounded) operator $S$ on $X$ induces a quadratic
form on $X$ given by $\|S\cdot\|^2$, that is, its domain is $D(S)$ and it maps $x \in D(S)$ to
$\|Sx\|^2$.


Example 2. A (generally unbounded) operator $S$ on $X$ is called positive if
$(Sx,x) \ge 0$ for all $x \in D(S)$. Such an operator induces a quadratic form on $X$
with domain $D(S)$ mapping $x \in D(S)$ to $(Sx, x)$.


A quadratic form $q$ on $X$ induces an inner product $(\cdot,\cdot)_q$ on $D(q)$ given by

$$(x,y)_q := (x,y) + q(x,y) \qquad (x,y \in D(q)).$$

Equivalently, $(\cdot,\cdot)_q$ is the inner product associated with the quadratic form $\|\cdot\|^2 +
q(\cdot)$ on $D(q)$. The associated norm $\|\cdot\|_q$ clearly dominates $\|\cdot\|$.


Definition 10.15. Let $q$ be a quadratic form on $X$.

• Say that $q$ is closed if the inner product space $(D(q),(\cdot,\cdot)_q)$ is complete.
Equivalently, if whenever $\{x_n\}_{n=1}^\infty$ is a sequence in $D(q)$ converging in $X$ to
some $x \in X$ and satisfying $q(x_n - x_m) \to 0$ as $n,m \to \infty$, we have $x \in D(q)$ and
$q(x_n - x) \to 0$ as $n \to \infty$.

• Say that $q$ is closable if whenever $\{x_n\}_{n=1}^\infty$ is a sequence in $D(q)$ converging
in $X$ to 0 and satisfying $q(x_n - x_m) \to 0$ as $n,m \to \infty$, we have $q(x_n) \to 0$ as $n \to \infty$.

• An extension of $q$ is a quadratic form $q'$ on $X$ that is an extension of $q$
as a function, that is, $D(q) \subset D(q')$ and $q'|_{D(q)} = q$. This means that
$(D(q),(\cdot,\cdot)_q)$ is an inner product subspace of $(D(q'),(\cdot,\cdot)_{q'})$.


Theorem 10.16. A quadratic form $q$ on $X$ is closable iff it admits a closed
extension. In that case, it has a smallest closed extension $\bar{q}$, called the closure of
$q$. Its domain $D(\bar{q})$ consists of all $x \in X$ for which there is a sequence $\{x_n\}_{n=1}^\infty$
in $D(q)$ converging to $x$ and satisfying $q(x_n - x_m) \to 0$ as $n,m \to \infty$, in which case
$\lim_{n\to\infty} q(x_n)$ exists and equals $\bar{q}(x)$.


Proof. It is clear that if $q$ admits a closed extension then it is closable.
Conversely, assume that $q$ is closable. We will express the completion of
$(D(q),(\cdot,\cdot)_q)$ as the Hilbert space coming from a quadratic form.

Let $x \in D(\bar{q})$ (the latter is defined in the theorem's statement). Pick a
sequence $\{x_n\}_{n=1}^\infty$ in $D(q)$ converging to $x$ and satisfying $q(x_n - x_m) \to 0$ as
$n,m \to \infty$. Since $q(\cdot)^{1/2}$ is a semi-norm on $D(q)$, the triangle inequality shows that
$|q(x_n)^{1/2} - q(x_m)^{1/2}| \le q(x_n - x_m)^{1/2}$ for all $n,m \in \mathbb{N}$. Consequently,
$\{q(x_n)^{1/2}\}_{n=1}^\infty$ is a Cauchy sequence, so it converges, and hence so does $\{q(x_n)\}_{n=1}^\infty$.
If $\{x'_n\}_{n=1}^\infty$ is another such
sequence, then $\{x_n - x'_n\}_{n=1}^\infty$ converges to 0 and

$$q\bigl((x_n - x'_n) - (x_m - x'_m)\bigr)^{1/2} \le q(x_n - x_m)^{1/2} + q(x'_n - x'_m)^{1/2} \to 0 \qquad (n,m \to \infty)$$

by the triangle inequality, thus $q(x_n - x'_n) \to 0$ as $n \to \infty$ as $q$ is closable, from which
it follows that $\lim_{n\to\infty} q(x_n) = \lim_{n\to\infty} q(x'_n)$ by the triangle inequality again.

To conclude, the extension $\bar{q}$ of $q$ introduced in the theorem's statement is
well defined, and it is readily seen to be a quadratic form (in particular, $D(\bar{q})$
is a linear subspace of $X$). Additionally, if $x \in D(\bar{q})$ and $\{x_n\}_{n=1}^\infty$ in $D(q)$ is as
above, then for every $n \in \mathbb{N}$, $\bar{q}(x_n - x) = \lim_{m\to\infty} q(x_n - x_m)$ by the definition
of $\bar{q}$, and since the right-hand side converges to 0 as $n \to \infty$, so does the left-hand
side, proving that $x_n \to x$ in $(D(\bar{q}), (\cdot,\cdot)_{\bar{q}})$. Hence, $D(q)$ is dense in the
inner product space $(D(\bar{q}),(\cdot,\cdot)_{\bar{q}})$ and every Cauchy sequence in $(D(\bar{q}), (\cdot,\cdot)_{\bar{q}})$
all of whose elements are in $D(q)$ converges in $(D(\bar{q}),(\cdot,\cdot)_{\bar{q}})$. This entails that
$(D(\bar{q}), (\cdot,\cdot)_{\bar{q}})$ is complete, namely, $\bar{q}$ is closed. Finally, $\bar{q}$ is plainly the smallest
closed extension of $q$.


Definition 10.17. For a closed quadratic form $q$ on $X$, a linear subspace $D$ of
$D(q)$ is a core for $q$ if $\overline{q|_D} = q$, equivalently: if $D$ is dense in $(D(q), (\cdot,\cdot)_q)$.


Example 3. If $S$ is a (generally unbounded) operator on $X$, then $S$ is closed
(resp., closable) iff the quadratic form $\|S\cdot\|^2$ of Example 1 is closed (resp.,
closable). If $S$ is closed, then a linear subspace $D$ of $D(S)$ is a core for $S$ (that
is, $\overline{S|_D} = S$) iff it is a core for the quadratic form $\|S\cdot\|^2$.


A (generally unbounded) selfadjoint operator $T$ on $X$ is positive iff $\sigma(T) \subset
[0, \infty)$. In this case, its square root $T^{1/2}$ makes sense by means of the operational
calculus, and it is also a positive selfadjoint operator on $X$. See Exercises 12 and
15.


Example 4. A positive selfadjoint operator $T$ on $X$ induces the closed densely
defined quadratic form $q_T := \|T^{1/2}\cdot\|^2$ on $X$ (take $S := T^{1/2}$ in Example 1).


We will now see that the construction of the previous example is exhaustive. 


Theorem 10.18 (The representation theorem). For every closed, densely
defined quadratic form $q$ on $X$ there exists a unique positive selfadjoint operator
$T$ on $X$ such that $q = q_T$. In addition, $D(T)$ is a core for $q$.


Proof. Closedness of $q$ means that the inner product space $(D(q),(\cdot,\cdot)_q)$ is
a Hilbert space. For $x \in X$, the linear functional $(\cdot,x)|_{D(q)}$ belongs to
$(D(q),(\cdot,\cdot)_q)^*$ and has norm at most $\|x\|$, because for each $y \in D(q)$, the
Cauchy-Schwarz inequality yields $|(y,x)| \le \|y\|\,\|x\| \le \|y\|_q\,\|x\|$. The "Little"
Riesz representation theorem thus gives a unique vector $Bx \in D(q)$ such that

$$(x,y) = (Bx,y)_q \qquad (\forall y \in D(q)), \tag{2}$$

and we have $\|Bx\|_q \le \|x\|$.

This construction produces a contractive (i.e., of norm at most 1) operator
$B : X \to (D(q),(\cdot,\cdot)_q)$ satisfying (2) for all $x \in X$. The positive definiteness of
$(\cdot,\cdot)$ and the density of $D(q)$ in $X$ imply that $B$ is injective and has dense range
in $(D(q),(\cdot,\cdot)_q)$. Treating $B$ as an operator $C : X \to X$, it is also contractive.
For all $x \in X$ we have $(x,Cx) = (Bx, Bx)_q \ge 0$ by (2). Therefore, $C$ is positive.

Define $T := C^{-1} - I$ (with domain $CX \subset D(q)$). It is a positive selfadjoint
operator as $C$ is positive and bounded. Using (2), for $x \in D(T)$ and $y \in D(q)$
we have

$$q(x,y) = (x,y)_q - (x,y) = (CC^{-1}x,y)_q - (x,y) = (C^{-1}x, y) - (x,y) = (Tx, y).$$

In particular, for $x$ in $D(T)$ (which is contained in $D(T^{1/2})$) we have $q(x) =
(Tx,x) = (T^{1/2}x, T^{1/2}x) = \|T^{1/2}x\|^2 = q_T(x)$, proving that $q|_{D(T)} = q_T|_{D(T)}$.
But $D(T)$ is a core for both closed quadratic forms $q_T$ and $q$. Indeed, on
one hand, $D(T)$ is a core for $T^{1/2}$, equivalently: for $q_T$ (see Exercise 15 and
Example 3). On the other hand, the range of $B$, which equals $D(T)$, is dense in
$(D(q),(\cdot,\cdot)_q)$, that is: $D(T)$ is a core for $q$. As a result, $q_T = q$.
The proof of uniqueness is left to the reader.
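
The construction in the proof can be traced in the simplest possible case. The following one-dimensional example is our own illustration (not from the text): we assume $X = \mathbb{C}$ with $(x,y) = x\bar{y}$ and $q(x) = c|x|^2$ for a fixed constant $c \ge 0$, so that $D(q) = X$.

```latex
(x,y)_q = (x,y) + q(x,y) = (1+c)\,x\bar{y},
\\[4pt]
(x,y) = (Bx,y)_q \ \ (\forall y)
  \iff x\bar{y} = (1+c)\,(Bx)\,\bar{y}
  \iff Bx = \frac{x}{1+c},
\qquad C = \frac{1}{1+c}\,I,
\\[4pt]
T = C^{-1} - I = (1+c)I - I = cI,
\qquad
q_T(x) = \|T^{1/2}x\|^{2} = |c^{1/2}x|^{2} = c\,|x|^{2} = q(x).
```

Here $\|Bx\|_q^2 = (1+c)\,|x|^2/(1+c)^2 \le |x|^2$, in agreement with the contractivity of $B$.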


Remark 10.19. Recall that if $T$, $S$ are selfadjoint operators on $X$ and $T \subset S$,
then $T = S$. In sharp contrast, it is possible for one closed, densely defined
quadratic form on $X$ to strictly extend another one; see Exercise 25. Extension
of closed, densely defined quadratic forms does not entail extension of the positive
selfadjoint operators associated to them by the representation theorem.


A quadratic form $q : D(q) \to [0,\infty)$ on a Hilbert space $X$ can be extended to
a function $\tilde{q} : X \to [0, \infty]$ assigning the value $\infty$ to all elements of $X \setminus D(q)$. Note
that this extension satisfies (1) (with $\tilde{q}$ in place of $q$) for all $x,y \in X$ and $\alpha \in \mathbb{C}$
because $D(q)$ is a linear subspace of $X$. Conversely, if $\tilde{q} : X \to [0,\infty]$ satisfies
(1) for all $x,y \in X$ and $\alpha \in \mathbb{C}$, then $D(q) := \tilde{q}^{-1}([0, \infty))$ is a linear subspace of
$X$ and $\tilde{q}|_{D(q)}$ is a quadratic form on $X$.

Recall from Exercise 6 in Chapter 3 that for a topological space $X$, a function
$f : X \to [-\infty, \infty]$ is called lower semi-continuous if it satisfies one of the following
equivalent conditions: for each $c \in \mathbb{R}$, $\{x \in X;\ f(x) > c\}$ is open; for each net
$\{x_\alpha\}_{\alpha \in A}$ in $X$ converging to $x \in X$ we have $f(x) \le \liminf_{\alpha \in A} f(x_\alpha)$; and for
each net $\{x_\alpha\}_{\alpha \in A}$ in $X$ converging to $x \in X$ such that $\{f(x_\alpha)\}_{\alpha \in A}$ converges
in $[-\infty, \infty]$ we have $f(x) \le \lim_{\alpha \in A} f(x_\alpha)$. If $X$ is first countable, nets can be
replaced by sequences in the foregoing.


Theorem 10.20. Let $q$ be a quadratic form on $X$. Then $q$ is closed (resp.,
closable) iff $\tilde{q}$ (resp., $q$) is lower semi-continuous.




Proof. Assume that $\tilde{q}$ (resp., $q$) is lower semi-continuous. Let $\{x_n\}_{n=1}^\infty$ be a
sequence in $D(q)$ converging in $X$ to some $x \in X$ (resp., to 0) and satisfying
$q(x_n - x_m) \to 0$ as $n,m \to \infty$. Since, for each $n$, the sequence $\{x_n - x_m\}_{m=1}^\infty$ converges
to $x_n - x$ (resp., to $x_n$), lower semi-continuity of $\tilde{q}$ (resp., $q$) implies that

$$\tilde{q}(x_n - x) \text{ (resp., } q(x_n)\text{) is } \le \liminf_{m\to\infty} q(x_n - x_m) \to 0 \qquad (n \to \infty).$$

In particular, $x$ belongs to $D(q)$ as the latter is a linear subspace of $X$ eventually
containing $x_n - x$. This proves that $q(x_n - x) \to 0$ (resp., $q(x_n) \to 0$) as $n \to \infty$,
that is, $q$ is closed (resp., closable).
Assume that $q$ is closable. Let $q'$ be a closed extension of $q$. Then $\tilde{q'}$ is lower
semi-continuous by the next paragraph. So $q = \tilde{q'}|_{D(q)}$ is lower semi-continuous.
Assume that $q$ is closed. We can further assume that $q$ is densely defined
without loss of generality. By the representation theorem, $q = q_T = \|T^{1/2}\cdot\|^2$
for some positive selfadjoint operator $T$ on $X$. Write $E$ for its resolution
of the identity. Recall that $D(T^{1/2})$ consists of all $x \in X$ such that
$\int_{[0,\infty)} \lambda\, d\|E(\lambda)x\|^2 < \infty$, in which case $\|T^{1/2}x\|^2 = \int_{[0,\infty)} \lambda\, d\|E(\lambda)x\|^2$. Hence,
for each $x \in X$, $\tilde{q}_T(x)$ equals $\int_{[0,\infty)} \lambda\, d\|E(\lambda)x\|^2$, which is the supremum
of the ascending sequence of non-negative numbers $\bigl\{\int_{[0,n]} \lambda\, d\|E(\lambda)x\|^2 =
\|T^{1/2}E([0,n])x\|^2\bigr\}_{n=1}^\infty$. Of course, $T^{1/2}E([0,n]) \in B(X)$ for all $n \in \mathbb{N}$. For
every bounded operator $B \in B(X)$, the function $\|B\cdot\|^2$ is continuous, thus lower
semi-continuous. From all this and Part (d) of Exercise 6 in Chapter 3 it follows
that $\tilde{q}_T$ is lower semi-continuous.


Exercises 


The generator of a semigroup 


1. Use notation as in Exercise 14, Chapter 9. The generator $A$ of the $C_0$-semigroup
$T(\cdot)$ is its strong right derivative at 0 with maximal domain
$D(A)$: denoting the (right) differential ratio at 0 by $A_h$, that is, $A_h :=
h^{-1}[T(h) - I]$ $(h > 0)$, we have

$$Ax = \lim_{h \to 0+} A_h x, \qquad x \in D(A) = \{x \in X;\ \lim_{h \to 0+} A_h x \text{ exists}\}.$$

Prove:

(a) $\bigcup_{t>0} V(t)X \subset D(A)$, and for each $t > 0$ and $x \in X$, $AV(t)x =
T(t)x - x$. (Hint: Exercise 14(e), Chapter 9.)

(b) $D(A)$ is dense in $X$. (Hint: by Part (a), $V(t)x \in D(A)$ for any $t > 0$
and $x \in X$ and $AV(t)x = T(t)x - x$. Apply Exercises 14(d) and 13(c)
in Chapter 9.)

(c) For $x \in D(A)$ and $t > 0$, $T(t)x \in D(A)$ and

$$AT(t)x = T(t)Ax = (d/dt)T(t)x,$$

where the right-hand side denotes the strong derivative at $t$ of $u :=
T(\cdot)x$. Therefore $u : [0,\infty) \to D(A)$ is a solution of class $C^1$ of the
abstract Cauchy problem (ACP)

$$\text{(ACP)} \qquad u' = Au, \qquad u(0) = x.$$

Also

$$\int_0^t T(s)Ax\, ds = T(t)x - x \qquad (x \in D(A)). \tag{*}$$

(Hint: for left derivation, use Exercise 14(c), Chapter 9.)

(d) $A$ is a closed operator. (Hint: use the identity

$$V(t)Ax = AV(t)x = T(t)x - x \qquad (x \in D(A);\ t > 0)$$

(cf. Part (a) and Exercise 13(c), Chapter 9).)

(e) If $v : [0,\infty) \to D(A)$ is a solution of class $C^1$ of ACP, then $v = T(\cdot)x$.
(This is the uniqueness of the solution of ACP when $A$ is the generator
of a $C_0$-semigroup.) In particular, the generator $A$ determines the
semigroup $T(\cdot)$ uniquely. (Hint: apply Exercise 13(d), Chapter 9, to
$V := T(\cdot)v(s - \cdot)$ on the interval $[0, s]$.)


Semigroups continuous in the u.o.t. 


2. Use notation as in Exercise 1. Suppose $T(h) \to I$ in the u.o.t. (i.e.,
$\|T(h) - I\| \to 0$ as $h \to 0+$). Prove:

(a) $V(h)$ is non-singular for $h$ small enough (which we fix from now on).
Define $A := [T(h) - I]V(h)^{-1}$ ($\in B(X)$).

(b) $T(t) - I = V(t)A$ for all $t > 0$ (with $A$ as above). Conclude that
$A$ is the generator of $T(\cdot)$ (in particular, the generator is a bounded
operator).

(c) Conversely, if the generator $A$ of $T(\cdot)$ is a bounded operator, then
$T(t) = e^{tA}$ (defined by the usual absolutely convergent series in
$B(X)$) and $T(h) \to I$ in the u.o.t. (Hint: the exponential is a
continuous semigroup (in the u.o.t.) with generator $A$; use the
uniqueness statement in Exercise 1(e).)


The resolvent of a semigroup generator 


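3. Let $T(\cdot)$ be a $C_0$-semigroup on the Banach space X. Let A be its
generator, and $\omega$ its type (cf. Exercise 14(f), Chapter 9). Fix $a > \omega$.
Prove:


(a) The Laplace transform

$L(\lambda) := \int_0^\infty e^{-\lambda t}\,T(t)\,dt$

converges absolutely (in B(X)) and $\|L(\lambda)\| = O(1/(\Re\lambda - a))$ for
$\Re\lambda > a$ (cf. Exercises 13(e) and 14(c), Chapter 9).


(b) $L(\lambda)(\lambda I - A)x = x$ for all $x \in D(A)$ and $\Re\lambda > a$.
(c) $L(\lambda)X \subset D(A)$, and $(\lambda I - A)L(\lambda) = I$ for $\Re\lambda > a$.


(d) Conclude that $\sigma(A) \subset \{\lambda \in \mathbb{C};\ \Re\lambda \le \omega\}$ and $R(\lambda; A) = L(\lambda)$ for
$\Re\lambda > \omega$.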


(e) For any $\lambda_k > a$ ($k = 1,\dots,m$),

$\Big\| \prod_k (\lambda_k - a)R(\lambda_k; A) \Big\| \le M,$ (1)

where M is a positive constant depending only on $a$ and $T(\cdot)$. In
particular

$\|R(\lambda)^m\| \le \frac{M}{(\lambda - a)^m} \quad (\lambda > a;\ m \in \mathbb{N}).$ (2)

(Hint: apply Part (d), and the multiple integral version of
Exercise 13(e), Chapter 9.)


(f) Let A be any closed densely defined operator on X whose resolvent
set contains a ray $(a,\infty)$ and whose resolvent $R(\cdot)$ satisfies $\|R(\lambda)\| \le
M/(\lambda - a)$ for $\lambda > \lambda_0$ (for some $\lambda_0 \ge a$). (Such an A is sometimes
called an abstract potential.) Consider the function $A_{(\cdot)} : (a,\infty) \to
B(X)$:

$A_\lambda := \lambda A R(\lambda) = \lambda^2 R(\lambda) - \lambda I.$

Then, as $\lambda \to \infty$,

$\lim A_\lambda x = Ax \quad (x \in D(A));$
$\lim \lambda R(\lambda) = I$ and $\lim A R(\lambda) = 0$ in the s.o.t.


Note that these conclusions are valid if A is the generator of a $C_0$-semigroup,
with $a > \omega$ fixed (cf. Exercise 3, Parts (d) and (e)).


4. Let A be a closed densely defined operator on the Banach space X such
that $(a,\infty) \subset \rho(A)$ and (2) in Exercise 3(e) is satisfied. Define $A_\lambda$
as in Exercise 3(f) and denote $T_\lambda(t) := e^{tA_\lambda}$ (the usual power series).
Prove:

(a) $\|T_\lambda(t)\| \le M \exp\big(t\,a\lambda/(\lambda - a)\big)$ for all $\lambda > a$. Conclude that

$\|T_\lambda(t)\| \le M e^{2at} \quad (\lambda > 2a)$ (3)

and

$\limsup_{\lambda \to \infty} \|T_\lambda(t)\| \le M e^{at}.$ (4)


(b) If $x \in D(A)$, then uniformly for $t$ in bounded intervals,

$\lim_{2a < \lambda,\mu \to \infty} \|T_\lambda(t)x - T_\mu(t)x\| = 0.$ (5)

(Hint: apply Exercise 13(d), Chapter 9, to the function $V(s) := T_\lambda(t - s)T_\mu(s)$
on the interval $[0, t]$; Exercise 1(c) to the semigroups $T_\lambda(\cdot)$ and
$T_\mu(\cdot)$; Part (a), and Exercise 3(f).)


(c) For each $x \in X$, $\{T_\lambda(t)x;\ \lambda \to \infty\}$ is Cauchy (uniformly for $t$ in
bounded intervals). (Use Part (b), the density of D(A), and (3) in
Part (a).) Define then

$T(t) = \lim_{\lambda \to \infty} T_\lambda(t)$

in the s.o.t. Then $T(\cdot)$ is a strongly continuous semigroup such that
$\|T(t)\| \le M e^{at}$ and

$T(t)x - x = \int_0^t T(s)Ax\,ds \quad (x \in D(A)).$ (6)

(Hint: use (*) in Exercise 1(c) for the semigroup $e^{tA_\lambda}$, and apply
Exercise 3(f).)

(d) If $A'$ is the generator of the semigroup $T(\cdot)$ defined in Part (c), then
$A \subset A'$. Since $\lambda I - A$ and $\lambda I - A'$ are both one-to-one and onto for
$\lambda > a$ and coincide on D(A), conclude that $A' = A$.


(e) An operator A with domain $D(A) \subset X$ is the generator of a
$C_0$-semigroup satisfying $\|T(t)\| \le M e^{at}$ for some real $a$ iff it is
closed, densely defined, $(a,\infty) \subset \rho(A)$ and (2) is satisfied. (Collect
information from earlier!) This is the Hille-Yosida theorem. In
particular (case M = 1 and a = 0), A is the generator of a contraction
semigroup iff it is closed, densely defined, and $\lambda R(\lambda)$ exist and are
contractions for all $\lambda > 0$. (Terminology: the bounded operators $A_\lambda$
are called the Hille-Yosida approximations of the generator A.)
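
The approximation scheme of Exercises 3(f) and 4 can be watched at work in the simplest possible model. The sketch below is only an illustration under stated assumptions: X is taken to be the one-dimensional space C, the "generator" is multiplication by a hypothetical complex number `a` with negative real part, and the names `yosida`, `a`, `t` are ours, not the book's. In this model $R(\lambda) = 1/(\lambda - a)$ and $A_\lambda = \lambda a/(\lambda - a)$, so $A_\lambda - a = a^2/(\lambda - a) \to 0$.

```python
import cmath

def yosida(a: complex, lam: float) -> complex:
    """Scalar Hille-Yosida approximation A_lam = lam * a * R(lam), R(lam) = 1/(lam - a)."""
    return lam * a / (lam - a)

# Hypothetical scalar "generator": multiplication by a on X = C (assumption).
a = complex(-1.0, 2.0)
t = 1.5
exact = cmath.exp(t * a)  # the semigroup T(t) = e^{ta}

errors = []
for lam in (10.0, 100.0, 1000.0, 10000.0):
    a_lam = yosida(a, lam)
    # Error of the approximating semigroup T_lam(t) = e^{t A_lam}.
    err = abs(cmath.exp(t * a_lam) - exact)
    errors.append(err)
    print(f"lambda={lam:8.0f}  |A_lam - a|={abs(a_lam - a):.3e}  |T_lam(t) - T(t)|={err:.3e}")
```

Both printed error columns shrink roughly like $1/\lambda$, which is the scalar shadow of the limits $A_\lambda x \to Ax$ and $T_\lambda(t) \to T(t)$ used above.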


Core for the generator 


5. Let $T(\cdot)$ be a $C_0$-semigroup on the Banach space X, and let A be its
generator. Prove:


(a) $T(\cdot)$ is a $C_0$-semigroup on the Banach space [D(A)]. (Recall that the
norm on [D(A)] is the graph norm $\|x\|_A := \|x\| + \|Ax\|$.)


(b) Let D be a $T(\cdot)$-invariant subspace of D(A), dense in X. For each
$x \in D$, consider $V(t)x := \int_0^t T(s)x\,ds$ (defined in the Banach space $\overline{D}$,
the closure of D in [D(A)]). Given $x \in D(A)$, let $x_n \in D$ be such that
$x_n \to x$ (in X, by density of D in X). Then $V(t)x_n \to V(t)x$ in the
graph-norm. Conclude that $V(t)x \in \overline{D}$ for each $t > 0$, and therefore
$x \in \overline{D}$, that is, D is dense in [D(A)]. (Recall that a dense subspace
of [D(A)] is called a core for A.) Thus a $T(\cdot)$-invariant subspace of
D(A) that is dense in X is a core for A. (On the other hand, a core
D for A is trivially dense in X, since D(A) is dense in X and D is
$\|\cdot\|_A$-dense in D(A).)


(c) A $C^\infty$-vector for A is a vector $x \in X$ such that $T(\cdot)x$ is of class $C^\infty$
("strongly") on $[0,\infty)$. Let $D^\infty$ denote the space of all $C^\infty$-vectors
for A. Then

$D^\infty = \bigcap_{n=1}^{\infty} D(A^n).$ (7)


(d) Let $\phi_n \in C_c^\infty(\mathbb{R})$ be non-negative, with support in $(0,1/n)$ and
integral equal to 1. Given $x \in X$, let $x_n = \int \phi_n(t)T(t)x\,dt$. Then:


(i) $x_n \to x$ in X;
(ii) $x_n \in D(A)$ and $Ax_n = -\int \phi_n'(t)T(t)x\,dt$; and
(iii) $x_n \in D(A^k)$ and $A^k x_n = (-1)^k \int \phi_n^{(k)}(t)T(t)x\,dt$ for all $k \in \mathbb{N}$.
In particular, $x_n \in D^\infty$.


Conclude that $D^\infty$ is dense in X and is a core for A (cf. Part (b)).


The Hille—Yosida space of an arbitrary operator 


6. Let A be an unbounded operator on the Banach space X with $(a,\infty) \subset
\rho(A)$, for some real $a$. Denote its resolvent by $R(\cdot)$. Let $\mathcal{A}$ be the multiplicative
semigroup generated by the set $\{(\lambda - a)R(\lambda);\ \lambda > a\}$. Let $Z :=
Z(\mathcal{A})$ (cf. Theorem 9.11), and consider $A_Z$, the part of A in Z. The Hille-Yosida
space for A, denoted W, is the closure of $D(A_Z)$ in the Banach
subspace Z. Prove:


(a) W is $R(\lambda)$-invariant for each $\lambda > a$ and $R(\lambda; A_W) = R(\lambda)|_W$. In
particular, $A_W$ is closed as an operator in the Banach space W.

(b) $\|R(\lambda; A_W)^m\|_{B(W)} \le 1/(\lambda - a)^m$ for all $\lambda > a$ and $m \in \mathbb{N}$.

(c) $\lim_{\lambda \to \infty} \lambda R(\lambda; A_W)w = w$ in the Z-norm. Conclude that $D(A_W)$ is
dense in W.

(d) $A_W$ generates a $C_0$-semigroup $T(\cdot)$ on the Banach space W, such that
$\|T(t)\|_{B(W)} \le e^{at}$.

(e) If Y is a Banach subspace of X such that $A_Y$ generates a $C_0$-semigroup
on Y with the growth condition $\|T(t)\|_{B(Y)} \le e^{at}$, then Y
is a Banach subspace of W. (This is the maximality of the Hille-Yosida
space.)




Convergence of semigroups 


7. Let $\{T_s(\cdot);\ 0 \le s < c\}$ be a family of $C_0$-semigroups on the Banach space
X, such that

$\|T_s(t)\| \le M e^{at} \quad (t \ge 0;\ 0 \le s < c)$ (8)

for some $M \ge 1$ and $a \ge 0$. Let $A_s$ be the generator of $T_s(\cdot)$, and denote
$T(\cdot) = T_0(\cdot)$ and $A = A_0$. Note that (8) implies that

$\|R(\lambda; A_s)\| \le M/(\lambda - a) \quad (\lambda > a;\ s \in [0,c)).$ (9)

Fix a core D for A. We say that $A_s$ graph-converge on D to A (as $s \to 0$)
if for each $x \in D$, there exists a vector function $s \in (0,c) \to x_s \in X$ such
that $x_s \in D(A_s)$ for each $s$ and $[x_s, A_s x_s] \to [x, Ax]$ in $X \times X$. Prove:


(a) $A_s$ graph-converge to A on D iff, for each $\lambda > a$ and $y \in (\lambda I - A)D$,
there exists a vector function $s \to y_s$ such that $[y_s, R(\lambda; A_s)y_s] \to
[y, R(\lambda; A)y]$ in $X \times X$ (as $s \to 0$). (Hint: $y_s = (\lambda I - A_s)x_s$ and (9).)
(b) If $A_s$ graph-converge to A, then as $s \to 0$, $R(\lambda; A_s) \to R(\lambda; A)$ in
the s.o.t. for all $\lambda > a$ (the latter property is called resolvents strong
convergence). (Hint: show that $(\lambda I - A)D$ is dense in X, and use
Part (a) and (9).)
(c) Conversely, resolvents strong convergence implies graph-convergence
on D. (Given $y \in (\lambda I - A)D$, choose $y_s = y$ constant!)


(d) If $T'(\cdot)$ is also a $C_0$-semigroup satisfying (8), and $A'$ is its generator,
then

$R(\lambda; A')\,[T'(t) - T(t)]\,R(\lambda; A) = \int_0^t T'(t-u)\,[R(\lambda; A') - R(\lambda; A)]\,T(u)\,du$ (10)

for $\Re\lambda > a$ and $t \ge 0$. (Hint: verify that the integrand in
(10) is the derivative with respect to $u$ of the function
$-T'(t-u)R(\lambda; A')T(u)R(\lambda; A)$.)

(e) Resolvents strong convergence implies semigroups strong convergence,
that is, for each $0 < \tau < \infty$,

$\sup_{0 \le t \le \tau} \|T_s(t)x - T(t)x\| \to 0$ (11)

as $s \to 0$. (Hint: by (8), it suffices to consider $x \in D(A) =
R(\lambda; A)X$. Write $[T_s(t) - T(t)]R(\lambda; A)y = R(\lambda; A_s)[T_s(t) - T(t)]y +
T_s(t)[R(\lambda; A) - R(\lambda; A_s)]y + [R(\lambda; A_s) - R(\lambda; A)]T(t)y$. Estimate the
norm of the first summand for $y \in D(A)$ (hence $y = R(\lambda; A)x$)
using (10), and use the density of D(A) and (8)-(9). The second
summand $\to 0$ strongly, uniformly for $t \le \tau$, by (8)-(9). For the
third summand, consider again $y \in D(A)$, for which one can use the
relation $T(t)y = y + \int_0^t T(u)Ay\,du$; cf. Exercise 13(b), Chapter 9, and
the dominated convergence theorem.)


(f) Conversely, semigroups strong convergence implies resolvents strong
convergence. (Hint: use the Laplace integral representation of the
resolvents.)


Collecting, we conclude that generators graph-convergence, resolvents 
strong convergence, and semigroups strong convergence are equivalent 
(when Condition (8) is satisfied). 


Exponential formulas 


8. Let A be the generator of a $C_0$-semigroup $T(\cdot)$ of contractions on the
Banach space X.
Let $F : [0,\infty) \to B(X)$ be contraction-valued, such that $F(0) = I$
and the (strong) right derivative of $F(\cdot)x$ at 0 coincides with $Ax$, for all
$x$ in a core D for A. Prove:


(a) Fix $t > 0$ and define $A_n$ as in Exercise 21(f), Chapter 6. Then $e^{tA_n} -
F(t/n)^n \to 0$ in the s.o.t. as $n \to \infty$.


(b) $s \to e^{sA_n}$ is a (uniformly continuous) contraction semigroup, for each
$n \in \mathbb{N}$ (cf. Exercise 21(a), Chapter 6).


(c) Suppose $T(\cdot)$ is a contraction $C_0$-semigroup. As $n \to \infty$, the semigroups
$e^{sA_n}$ converge strongly to the semigroup $T(s)$, uniformly on compact
intervals (cf. conclusion of previous exercise (7); note that $A_n x \to Ax$
for all $x \in D$). Conclude that $F(t/n)^n \to T(t)$ in the s.o.t., for each
$t \ge 0$.


(d) Let $T(\cdot)$ be a $C_0$-semigroup such that $\|T(t)\| \le e^{at}$, and consider
the contraction semigroup $S(t) := e^{-at}T(t)$ (with generator $A - aI$;
$a \ge 0$). Choose F as follows: $F(0) = I$ and for $0 < s < 1/a$,

$F(s) := (s^{-1} - a)R(s^{-1}; A) = (s^{-1} - a)R(s^{-1} - a;\ A - aI).$

Verify that F satisfies the hypothesis stated at the beginning of the
exercise, and conclude that

$T(t) = \lim_{n \to \infty} \Big[\frac{n}{t}\,R\Big(\frac{n}{t}; A\Big)\Big]^n$ (12)

in the s.o.t., for each $t > 0$.


(e) Let $T(\cdot)$ be any $C_0$-semigroup. By Exercise 14(c), Chapter 9,
$\|T(t)\| \le M e^{at}$ for some $M \ge 1$ and $a \ge 0$. Consider the equivalent
norm

$|x| := \sup_{t \ge 0} e^{-at}\|T(t)x\| \quad (x \in X).$

Then $|T(t)x| \le e^{at}|x|$, and therefore (12) is valid over $(X, |\cdot|)$, hence
over X (since the two norms are equivalent). Relation (12) (true for
any $C_0$-semigroup!) is called the exponential formula for semigroups.




(f) Let A, B, C generate contraction $C_0$-semigroups $S(\cdot), T(\cdot), U(\cdot)$,
respectively, and suppose $C = A + B$ on a core D for C. Then

$U(t) = \lim_{n \to \infty} [S(t/n)T(t/n)]^n \quad (t \ge 0)$ (13)

in the s.o.t. (Hint: choose $F(t) = S(t)T(t)$ in Part (c).)
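
Formulas (12) and (13) can be checked numerically in finite dimensions, where every operator is bounded and all limits hold in norm. The following is a minimal sketch, assuming X is R^2 with hypothetical generators A = [[0, 1], [-1, 0]] (rotations) and B = [[-1, 0], [0, 0]] (a diagonal contraction semigroup), so that C = A + B; the helper names (`matmul`, `expm`, `product_formula`, `resolvent_formula`) are ours. It illustrates the statements and is not a substitute for the proofs requested in the exercise.

```python
import math

I2 = [[1.0, 0.0], [0.0, 1.0]]

def matmul(P, Q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(P, n):
    """n-th power of a 2x2 matrix by repeated multiplication."""
    R = I2
    for _ in range(n):
        R = matmul(R, P)
    return R

def expm(P, terms=40):
    """Matrix exponential by the truncated power series (adequate for small 2x2 matrices)."""
    result = [row[:] for row in I2]
    term = [row[:] for row in I2]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, P)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def inv2(P):
    """Inverse of an invertible 2x2 matrix."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det], [-P[1][0] / det, P[0][0] / det]]

def dist(P, Q):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(P[i][j] - Q[i][j]) for i in range(2) for j in range(2))

def S(t):  # exp(tA) for A = [[0, 1], [-1, 0]]: a rotation
    return [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

def T(t):  # exp(tB) for B = [[-1, 0], [0, 0]]
    return [[math.exp(-t), 0.0], [0.0, 1.0]]

C = [[-1.0, 1.0], [-1.0, 0.0]]  # C = A + B

def U(t):  # exp(tC)
    return expm([[t * v for v in row] for row in C])

def product_formula(t, n):
    """[S(t/n) T(t/n)]^n, the right-hand side of (13) before the limit."""
    return matpow(matmul(S(t / n), T(t / n)), n)

def resolvent_formula(t, n):
    """[(n/t) R(n/t; C)]^n, the right-hand side of (12) before the limit."""
    lam = n / t
    R = inv2([[lam - C[0][0], -C[0][1]], [-C[1][0], lam - C[1][1]]])
    return matpow([[lam * v for v in row] for row in R], n)

for n in (1, 10, 100, 1000):
    print(n, dist(product_formula(1.0, n), U(1.0)), dist(resolvent_formula(1.0, n), U(1.0)))
```

In this example A and B do not commute, so S(t)T(t) differs from U(t) for a single step, and the printed errors decay only as n grows.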


Groups of operators 


9. A group of operators on the Banach space X is a map $T(\cdot) : \mathbb{R} \to B(X)$
such that

$T(s+t) = T(s)T(t) \quad (s,t \in \mathbb{R}).$

We assume that it is of class $C_0$, that is, the semigroup $T(\cdot)|_{[0,\infty)}$ is of
class $C_0$. Let A be the generator of this semigroup. Prove:


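(a) The semigroup $S(t) := T(-t)$, $t \ge 0$, is of class $C_0$, and has the
generator $-A$.


(b) $\sigma(A)$ is contained in the strip

$\Omega : -\omega' \le \Re\lambda \le \omega,$

where $\omega, \omega'$ are the types of the semigroups $T(\cdot)$ and $S(\cdot)$, respectively.
Fix $a > \omega$ and $a' > \omega'$, and let

$\Omega' = \{\lambda \in \mathbb{C};\ -a' \le \Re\lambda \le a\}.$

For $\lambda \notin \Omega'$,

$\|R(\lambda; A)^m\| \le \frac{M}{d(\lambda, \Omega')^m}.$ (14)

If A generates a bounded $C_0$-group, then $\sigma(A) \subset i\mathbb{R}$ and

$\|R(\lambda; A)^m\| \le \frac{M}{|\Re\lambda|^m},$

where M is a bound for $\|T(\cdot)\|$.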


An operator A generates a $C_0$-group of operators iff it is closed,
densely defined, has spectrum in a strip $\Omega$ as in Part (b), and (14)
is satisfied for all real $\lambda \notin [-a', a]$. (Hint: apply the Hille-Yosida
theorem (cf. Exercise 4(e)) separately in the half-planes $\Re\lambda > a$ and
$\Re\lambda < -a'$.)


Let $T(\cdot)$ be a $C_0$-group of unitary operators on a Hilbert space X.
Let $H = -iA$, where A is the generator of $T(\cdot)$. Then H is a
(closed, densely defined) symmetric operator with real spectrum. In
particular, $iI - H$ and $-iI - H$ are both onto, so that the deficiency
indices of H are both zero. Therefore H is selfadjoint (cf. (6) following
Definition 10.10).




(e) Define $e^{itH}$ by means of the operational calculus for the selfadjoint
operator H. This is a $C_0$-group with generator $iH = A$, and therefore
$T(t) = e^{itH}$ (cf. Exercise 1(e): the generator determines the semigroup
uniquely). This representation of unitary groups is Stone's theorem.


Unbounded operators on Hilbert spaces 


All operators are unbounded unless stated otherwise.
Let X be a Hilbert space. Unless stated otherwise, T is a selfadjoint
operator on X with resolution of the identity E.


10. Let $f, g : \mathbb{R} \to \mathbb{C}$ be Borel functions. Prove:

(a) $\|f(T)x\|^2 = \int_{\mathbb{R}} |f|^2\,d\|E(\cdot)x\|^2$ for all $x \in D(f(T))$.

(b) $(f(T)x, y) = \int_{\mathbb{R}} f\,d(E(\cdot)x, y)$ for all $x \in D(f(T))$ and $y \in X$.
(Hint: the equality is not difficult after proving that the integral exists.
One way to do that is to show that the total variation of the complex
measure $(E(\cdot)x, y)$ is bounded by the function $\|E(\cdot)x\|\,\|E(\cdot)y\|$.)

(c) $f(T) + g(T) \subset (f + g)(T)$.

(d) $f(T)g(T) \subset (fg)(T)$ and $D(f(T)g(T)) = D((fg)(T)) \cap D(g(T))$.

(e) For every $n \in \mathbb{N}$, $T^n$ equals $f(T)$ where $f(\lambda) := \lambda^n$ for each $\lambda \in \mathbb{R}$.


11. Let $f : \mathbb{R} \to \mathbb{R}$ be Borel. Show that the resolution of the identity of the
selfadjoint operator $f(T)$ maps a Borel set $\delta \subset \mathbb{R}$ to $E(f^{-1}(\delta))$.


12. Recall from Section 10.7 that an operator S on X is called positive if
$(Sx, x) \ge 0$ for all $x \in D(S)$. Prove that the selfadjoint operator T is
positive iff $\sigma(T) \subset [0,\infty)$ iff E is supported in $[0,\infty)$.


13. Prove that T is injective iff the range of T is dense in X iff $E(\{0\}) = 0$.


14. Assuming that T is positive selfadjoint, prove that $(Tx, x) > 0$ for all
$0 \ne x \in D(T)$ iff T is injective. In this case T is called strictly positive.


15. Assume that T is a positive selfadjoint operator on X. The square root
of T, denoted by $T^{1/2}$, is $f(T)$ for a Borel function $f : \mathbb{R} \to \mathbb{R}$ with
$f(\lambda) = \lambda^{1/2}$ when $\lambda \ge 0$ (the values of $f(\lambda)$ for $\lambda < 0$ do not matter).
Prove:

(a) $T^{1/2}$ is the unique positive selfadjoint operator on X whose square is
T, namely $(T^{1/2})^2 = T$. (Hint: Exercises 10 and 11.)

(b) $D(T) \subset D(T^{1/2})$ and $D(T)$ is a core for $T^{1/2}$.


16. Let S be a closed densely defined operator on X. We will prove that $S^*S$
is a positive selfadjoint operator (in particular, it is densely defined!) and
that $D(S^*S)$ is a core for S. See Exercise 26 for a different proof.


(a) Prove: $S^*S$ is positive, $((I + S^*S)x, x) \ge (x, x)$ for all $x \in D(S^*S)$,
and $I + S^*S$ is one-to-one.

(b) Recall from (4) of Section 10.2 that $\Gamma(S^*) = (Q\Gamma(S))^\perp$. Prove that
the Hilbert space $X \times X$ equals $Q\Gamma(S) \oplus \Gamma(S^*)$.

(c) Prove that for each $w, z \in X$ there exist unique elements $x \in D(S)$
and $y \in D(S^*)$ solving the equations $w = Sx + y$ and $z = -x + S^*y$.

(d) Prove that for each $z \in X$ there exist unique elements $Bz \in D(S)$
and $Cz \in D(S^*)$ such that $SBz = Cz$ and $z = Bz + S^*Cz$.

(e) Part (d) constructs functions $B, C : X \to X$. Prove: B, C are linear
operators, $\|B\|, \|C\| \le 1$, B is positive, and $I = (I + S^*S)B$. The last
equality together with Part (a) imply that $B = (I + S^*S)^{-1}$.

(f) Use the selfadjointness of B to conclude the selfadjointness of $S^*S$.

(g) Prove that $D(S^*S)$ is a core for S. (Hint: assume by contradiction
that this is false, i.e., that $\Gamma(S|_{D(S^*S)})$ is not dense in $\Gamma(S)$.
Let $[x, Sx] \in \Gamma(S)$ be orthogonal to $\Gamma(S|_{D(S^*S)})$. Show that $x$ is
orthogonal to $(I + S^*S)D(S^*S) = X$.)


17. Let S be a closed densely defined operator on X. By Exercise 16, $S^*S$ is
a positive selfadjoint operator and $D(S^*S)$ is a core for S. The absolute
value of S is the positive selfadjoint operator $|S| := (S^*S)^{1/2}$, where the
right-hand side is given by Exercise 15. Prove that $D(|S|) = D(S)$ and
$\||S|x\| = \|Sx\|$ for all $x$ in the common domain. (Hint: prove this first for
$x \in D(S^*S)$ and then employ $D(S^*S)$ being a core for both $|S|$ and S.)


18. Let $B \in B(X)$. Prove that the following conditions are equivalent; when
they hold we say that T, B commute:

(i) $BT \subset TB$;

(ii) for all Borel sets $\delta \subset \mathbb{R}$ the bounded operators $E(\delta)$, B commute;

(iii) for all $f \in B(\mathbb{R})$ the bounded operators $f(T)$, B commute;

(iv) for some (resp., all) $\lambda \in \rho(T)$ the bounded operators $R(\lambda; T)$, B
commute; and

(v) for all $t \in \mathbb{R}$ the unitary operator $e^{itT}$ commutes with B.

Show also that in this case BT is closable and $\overline{BT} = TB$.

(Hints: use Exercise 18 of Chapter 9 to prove that (iv) $\Rightarrow$ (ii). Use
the following fact to prove that (v) $\Rightarrow$ (ii): the function that maps a
regular complex Borel measure $\mu$ on $\mathbb{R}$ to its Fourier-Stieltjes transform
$\hat\mu : \mathbb{R} \to \mathbb{C}$ given by $\hat\mu(y) := \int_{\mathbb{R}} e^{-ixy}\,d\mu(x)$ is injective (see Section II.3.3).)


19. Commutativity of two unbounded operators is not trivial to define. Let
T, S be selfadjoint operators on X with resolutions of the identity E, F,
respectively.


(a) Prove that the following conditions are equivalent; when they hold
we say that T, S commute (strongly):


(i) for all Borel sets $\delta, \varepsilon \subset \mathbb{R}$ the projections $E(\delta)$, $F(\varepsilon)$ commute;
(ii) for all $f, g \in B(\mathbb{R})$ the bounded operators $f(T)$, $g(S)$ commute;
(iii) for some (resp., all) $\lambda \in \rho(T)$ and $\mu \in \rho(S)$ the bounded
operators $R(\lambda; T)$ and $R(\mu; S)$ commute; and
(iv) for all $t, s \in \mathbb{R}$ the unitary operators $e^{itT}$, $e^{isS}$ commute.


(b) Prove that if T, S commute, then so do the (selfadjoint, generally
unbounded) operators $f(T)$, $g(S)$ for all Borel functions $f, g : \mathbb{R} \to \mathbb{R}$.


20. Assume that the selfadjoint operators T, S on X commute. Prove that:


(a) T, S admit a common core: there exists a dense subspace D of X that
is contained in $D(T) \cap D(S)$ and such that $\overline{T|_D} = T$ and $\overline{S|_D} = S$.


(b) If T, S agree on a dense subspace of X (contained in $D(T) \cap D(S)$),
then T = S.


(c) $T + S$, $TS$, $ST$ are closable, their closures are selfadjoint and commute
with one another and with T and S, and $\overline{TS} = \overline{ST}$.


21. Let T be a positive densely defined operator on X. Assume that $TD(T) \subset
D(T)$ and that $I + T$ is bijective as a map from $D(T)$ to itself. Prove that
T is selfadjoint. (Hint: show that $(I + T)^{-1}$ is bounded and positive.)


22. A conjugation on X is a conjugate-linear map $J : X \to X$ such that
$(Jx, Jy) = (y, x)$ for all $x, y \in X$ and $J^2 = I$.

Prove that a closed densely defined symmetric operator T on X that
commutes with some conjugation J in the simplest sense that $TJ = JT$
has equal deficiency indices, and thus has a selfadjoint extension.


23. Let T be a closed densely defined operator on X. Let $\{x_\alpha\}_{\alpha \in \mathcal{A}}$ be a net
in $D(T)$ that converges weakly in X to some $x$ such that $\{\|Tx_\alpha\|\}_{\alpha \in \mathcal{A}}$ is
bounded by some $C \ge 0$. Prove that $x \in D(T)$ and $\|Tx\| \le C$.


Quadratic forms 


Let X denote a Hilbert space.


24. Let S be a densely defined positive operator on X. Consider the densely
defined form $q$ associated to S as in Example 2 of Section 10.7, that
is, $D(q) = D(S)$ and $q(x) = (Sx, x)$ for all $x \in D(S)$. Show that $q$ is
closable, hence there exists a positive selfadjoint operator T on X such
that $\overline{q} = q_T$. Prove that T is an extension of S. It is called the Friedrichs
extension of S. This exercise proves that every densely defined positive
operator possesses a positive selfadjoint extension.


25. (Continuing Exercise 24.) Let S be a densely defined positive operator on
X, let T be the Friedrichs extension of S, and let $T'$ be another positive
selfadjoint extension of S. Prove that $q_{T'}$ extends $q_T$. (Remark: generally
$T'$ does not extend T, for that is equivalent to $T = T'$; cf. Remark 10.19.)


26. (A different approach to Exercise 16.) Let S be a closed densely defined
operator on X. Consider the closed densely defined form $\|S\cdot\|^2$ associated
to S as in Example 1 of Section 10.7, and let T be the positive selfadjoint
operator on X such that $\|S\cdot\|^2 = q_T$ (see the representation theorem
10.18). Prove that $T = S^*S$ and that $D(S^*S)$ is a core for S.


27. (a) Let $q_1, q_2$ be closed quadratic forms on X. Prove that $q_1 + q_2$, with
maximal domain $D(q_1) \cap D(q_2)$, is a closed quadratic form on X.

(b) Let A, B be positive selfadjoint operators on X such that $D(A^{1/2}) \cap
D(B^{1/2})$ is dense in X. Prove that there exists a unique positive
selfadjoint operator C on X such that $D(C^{1/2}) = D(A^{1/2}) \cap D(B^{1/2})$
and for every $x$ in this subspace we have $\|C^{1/2}x\|^2 = \|A^{1/2}x\|^2 +
\|B^{1/2}x\|^2$. The operator C is called the quadratic form sum of A and
B.


28. For quadratic forms $q_1, q_2$ on X, write $q_1 \le q_2$ if $D(q_2) \subset D(q_1)$ and
$q_1(x) \le q_2(x)$ for every $x \in D(q_2)$. This is equivalent to qi < q. For
positive selfadjoint operators $A_1, A_2$ on X, write $A_1 \le A_2$ if $q_{A_1} \le q_{A_2}$,
that is, $D(A_2^{1/2}) \subset D(A_1^{1/2})$ and $\|A_1^{1/2}x\| \le \|A_2^{1/2}x\|$ for every $x \in
D(A_2^{1/2})$.


(a) Let $\{q_n\}_{n=1}^\infty$ be an ascending sequence of quadratic forms. Prove that

$D(q) := \Big\{x \in \bigcap_{n=1}^\infty D(q_n);\ \lim_{n \to \infty} q_n(x) \text{ exists in } [0,\infty)\Big\}$

is a linear subspace of X, and that the pointwise limit $q : D(q) \to
[0,\infty)$, $q(\cdot) := \lim_{n \to \infty} q_n(\cdot)$, is a quadratic form. Note that q(-) =
limn—+oo Gn(-).

(b) Suppose that $\{q_n\}_{n=1}^\infty$ as in 28(a) consists of closed forms. Prove that
$q$ is also closed.

(c) Let $\{A_n\}_{n=1}^\infty$ be an ascending sequence of positive selfadjoint
operators on X. Suppose that

$D := \Big\{x \in \bigcap_{n=1}^\infty D(A_n^{1/2});\ \lim_{n \to \infty} \|A_n^{1/2}x\| \text{ exists in } [0,\infty)\Big\}$

is dense in X. Prove that there exists a unique positive selfadjoint
operator A on X such that $D(A^{1/2}) = D$ and for every $x$ in this
subspace we have $\|A^{1/2}x\| = \lim_{n \to \infty} \|A_n^{1/2}x\|$.


29. Take $X := L^2(\mathbb{R})$ with respect to the Lebesgue measure. Define $q :
C_c(\mathbb{R}) \to [0,\infty)$ by $q(f) := |f(0)|^2$, $f \in C_c(\mathbb{R})$. Prove that $q$ is a
non-closable, densely defined quadratic form on X.


11 


C*-algebras 


This chapter continues the discussion on Banach algebras started in Chapter 7 
and is devoted exclusively to C*-algebras, whose theory is vast and interacts 
with numerous areas of mathematics. 

One of the highlights of Chapter 7 was the commutative Gelfand—Naimark 
theorem (7.16), roughly saying that every commutative C*-algebra “is” $C_0(\Omega)$
for some locally compact Hausdorff space $\Omega$. This is the first of many reasons
why the theory of general C*-algebras is often referred to as “non-commutative 
topology”. The reader should bear this in mind throughout the chapter. One of 
the highlights here is the non-commutative Gelfand—Naimark theorem (11.26), 
which roughly says that every (“abstract”) C*-algebra “is” a C*-subalgebra of 
B(X) for some Hilbert space X (this is a “concrete” C*-algebra). 

The chapter is structured as follows. After fixing the notation and 
conventions, we proceed to exploit further the continuous operational calculus 
of normal elements introduced in Section 7.4. One remarkable and useful result 
is that every *-homomorphism between C*-algebras is automatically continuous 
(even contractive), and that it is isometric if it is injective. Such results, in which 
algebraic properties imply analytic properties, are particularly satisfying. 

Positivity, of elements and of functionals, is among the foundations of 
C*-algebra theory. A selfadjoint element of a C*-algebra with non-negative 
spectrum is called positive. These elements have various interesting properties. 
For instance, they form a cone and admit (positive) square roots. Moreover, 
every element of the form x*z is positive—a key fact in C*-algebras. We show 
that every C*-algebra possesses an increasing contractive approximate identity 
consisting of positive elements. 

Approximate identities, in turn, are employed to establish two facts about 
(closed) ideals in C*-algebras: first, that they are automatically selfadjoint, thus 
C*-algebras by themselves; and second, that the quotients are also C*-algebras. 

Recall that by the Riesz representation theorem, the space $C_0(\Omega)^*$ of all
bounded linear functionals on $C_0(\Omega)$, where $\Omega$ is a locally compact Hausdorff
space, is isometrically isomorphic to the space of regular complex Borel measures
on $\Omega$. Hence, the bounded linear functionals on an arbitrary C*-algebra should
be viewed as a non-commutative version of complex regular Borel measures.


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press.
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0011

A linear functional on a C*-algebra is called positive if it maps positive 
elements of the algebra to non-negative numbers. Such functionals are the 
non-commutative version of regular finite positive Borel measures. It turns out 
that every positive functional $\omega$ is automatically continuous and that its norm
satisfies $\|\omega\| = \lim_\alpha \omega(e_\alpha)$ for every approximate identity $\{e_\alpha\}$; and conversely, a
bounded linear functional satisfying this equality for some approximate identity 
is positive. 

A positive linear functional of norm 1 is called a state. This terminology, 
courtesy of I. E. Segal, is taken from quantum mechanics. States are plentiful: 
there are enough of them to compute the norm of every element of the algebra. 

Representations of C*-algebras on Hilbert spaces are of the utmost 
importance. As in other areas of mathematics, their role is to match 
concrete objects (here, bounded operators on some Hilbert space) to abstract 
objects (here, elements of the C*-algebra). The Gelfand—Naimark-Segal (GNS) 
construction assigns to every state of a C*-algebra A some cyclic representation 
of A. This construction and the richness of states in C*-algebras imply the 
non-commutative Gelfand—Naimark theorem: every C*-algebra has a faithful 
representation on some Hilbert space! This cardinal theorem of Gelfand and 
Naimark was the starting point of the whole theory of C*-algebras. 

In the final section we study convexity in positive linear functionals on 
C*-algebras. The set of states of a C*-algebra is convex and its extremal 
points are called pure states. The representation assigned to a state by 
the GNS theorem is irreducible iff the state is pure. It also presents two 
decomposition theorems, which, loosely speaking, are non-commutative versions 
of two results/constructions on measures: the Jordan decomposition of a real 
measure and the total variation measure of a complex measure (see Section 1.9). 

Last, a word about terminology is in order. What we presently call a 
C*-algebra was initially called a B*-algebra (after Banach). Back then, a C*- 
algebra was a B*-algebra satisfying the extra assumption that x*x is positive for 
every x. Gelfand and Naimark conjectured that this assumption was redundant. 
Kaplansky proved their conjecture based on results of Fukamiya and of Kelley 
and Vaught. The first edition of this book used the old term “B*-algebras”. 


11.1 Notation and examples 


Recall that a C*-algebra is a Banach algebra A with involution $x \mapsto x^*$ satisfying
the so-called C*-identity:

$\|x^*x\| = \|x\|^2 \quad (\forall x \in A).$

A C*-algebra need not be unital; if it is, we denote its unit by 1 (in contrast to
Chapters 7 and 9, where the unit was denoted by e), and it is then selfadjoint
($1^* = 1$) and of norm 1. The unitization of a C*-algebra A (see Section 7.3) is
again denoted by $A^\#$, and it equals A if the latter is unital.




Let us start with some examples and non-examples. 
Examples. We continue the examples from the beginning of Section 7.1. 


1. For a locally compact Hausdorff space $\Omega$, $C_b(\Omega)$ and $C_0(\Omega)$ are commutative
C*-algebras with respect to the pointwise operations, including pointwise
complex conjugation as involution, and the uniform norm. So is $L^\infty(\Omega, \mathcal{A}, \mu)$
for a measure space $(\Omega, \mathcal{A}, \mu)$ with respect to the pointwise operations and
the $L^\infty$-norm.


2. For a Hilbert space X, the Banach algebra B(X) of all bounded linear 
operators on X is a C*-algebra with the Hilbert adjoint as involution. The 
set K(X) of compact operators on X is a closed subalgebra (indeed, an 
ideal) of B(X) and it is also closed under the adjoint operation. Hence, it
is a C*-subalgebra of B(X).


3. The Banach algebra $A(\mathbb{D})$ admits an involution given by $f^*(z) := \overline{f(\bar z)}$
($f \in A(\mathbb{D})$, $z \in \mathbb{D}$), but it is not a C*-algebra. Indeed, the function $f \in A(\mathbb{D})$
given by $f(z) := z + i$, $z \in \mathbb{D}$, satisfies $\|f^*f\| = 2 = \|f\|$, so $\|f^*f\| \ne \|f\|^2$.
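
The norms in Example 3 can be confirmed numerically. The sketch below is an illustration only: it assumes the uniform norm on the closed unit disc and approximates it by sampling the unit circle, which suffices by the maximum modulus principle; the helper `sup_on_circle` and the sample count are our own choices. Here $f^*(z) = z - i$ and $(f^*f)(z) = z^2 + 1$.

```python
import cmath
import math

def sup_on_circle(g, samples=3600):
    """Approximate the sup of |g| over the closed unit disc by sampling the unit circle."""
    return max(abs(g(cmath.exp(2j * math.pi * k / samples))) for k in range(samples))

def f(z):
    return z + 1j

def f_star(z):
    # Involution of the disc algebra: f*(z) = conj(f(conj(z))), here equal to z - i.
    return f(z.conjugate()).conjugate()

def product(z):
    # (f* f)(z) = (z - i)(z + i) = z^2 + 1
    return f_star(z) * f(z)

norm_f = sup_on_circle(f)           # attained at z = i
norm_prod = sup_on_circle(product)  # attained at z = 1 and z = -1
print(norm_f, norm_prod, norm_f ** 2)
```

The two computed norms agree (both are 2 up to rounding), while the square of the first is 4, so the C*-identity fails for this involution.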


Henceforth we write $A_{sa}$ for the subset of a C*-algebra A consisting of its
selfadjoint elements; it is a real subspace of A. We also write 1 for $1_{A^\#}$ (even
when A is not unital) for brevity. This should not lead to any confusion.

In this chapter and Chapter 12, an element p of a C*-algebra A is called a 
projection if it is a selfadjoint idempotent: p? = p = p*. In particular, when A C 
B(X) for a Hilbert space X, a projection P in A is what we called in Section 8.2 
an “orthogonal projection” (characterized by the ranges PX and (I— P)X being 
orthogonal). For a Hilbert space X and a closed subspace Xo C X, the unique 
projection in B(X) whose range is Xo is called the projection of X onto Xo. 


11.2 The continuous operational calculus 
continued 


This section presents some further results pertaining to the continuous
operational calculus of normal elements in C*-algebras. Let A be a C*-algebra.
Recall that the operational calculus for a normal element $x \in A$ is the (unique)
unital *-isomorphism from $C(\sigma(x))$ onto $[x]$ (:= the unital C*-subalgebra of $A^\#$
generated by $x$ and 1) mapping the identity function $f_1$ to $x$.

First, if A is unital, then every element of A is a linear combination of at
most four unitary elements of A. Indeed, let first $x \in A_{sa}$. We may assume that
$\|x\| \le 1$, so that $\sigma(x) \subset [-1,1]$. Define $f \in C(\sigma(x))$ by $f(\lambda) = \lambda + i\sqrt{1 - \lambda^2}$,
$\lambda \in \sigma(x)$. Then $f\bar f = 1 = \bar f f$ identically and $\frac12(f + \bar f) = f_1$. Since the operational
calculus is a unital *-homomorphism it follows that $u := f(x)$ satisfies $uu^* =
1 = u^*u$, that is, $u$ is unitary, and $\frac12(u + u^*) = x$. This completes the proof
because every element of A can be written as $x + iy$ with $x, y \in A_{sa}$.
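
Since the operational calculus reduces the argument above to scalar identities on $\sigma(x)$, these identities can be checked pointwise. The following is a small sketch, assuming the hypothetical selfadjoint element x = diag(-1, -0.3, 0, 0.5, 1) in the algebra of diagonal 5 x 5 matrices, so that its spectrum is the list `spectrum` and $f(x)$ is the diagonal matrix with entries $f(\lambda)$.

```python
import math

def f(lam: float) -> complex:
    """f(lam) = lam + i*sqrt(1 - lam^2) for lam in [-1, 1]."""
    return complex(lam, math.sqrt(1.0 - lam * lam))

# Hypothetical spectrum of a selfadjoint x with norm at most 1 (assumption).
spectrum = [-1.0, -0.3, 0.0, 0.5, 1.0]

# Diagonal entries of u = f(x).
u = [f(lam) for lam in spectrum]

for lam, w in zip(spectrum, u):
    # |w| = 1 says u u* = 1 = u* u; the real part of w says (u + u*)/2 = x.
    print(lam, abs(w), ((w + w.conjugate()) / 2).real)
```

Each entry of u has modulus 1 and real part equal to the corresponding point of the spectrum, so x is the average of the two unitaries u and u*.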

Next, normal elements of unital C*-algebras possess two operational calculi: 
continuous and analytic (Section 9.10). They coincide: 




Theorem 11.1. Suppose that A is a unital C*-algebra and $a \in A$ is normal.
For every $f \in H(\sigma(a))$, the two meanings assigned to the symbol $f(a)$, namely,
the analytic and the continuous operational calculi, coincide.


Proof. Denote by $\tau$ the continuous operational calculus of $a$, which is a
*-isomorphism from $C(\sigma(a))$ onto $[a]$. Observe that for each $\lambda \in \rho(a)$, $R(\lambda; a)$
equals the image under $\tau$ of the function $\mu \mapsto (\lambda - \mu)^{-1}$, thus $R(\lambda; a) \in [a]$.

Let $\Omega$ be an open neighborhood of $\sigma(a)$ in which $f$ is analytic, let $\Gamma \in
\Gamma(\sigma(a), \Omega)$, and write

$b := \frac{1}{2\pi i} \int_\Gamma f(\lambda)\,R(\lambda; a)\,d\lambda.$

We should prove that $b = \tau(f)$. From the first paragraph of the proof it follows
that $b \in [a]$. From the continuity of $\tau^{-1}$ we get

$\tau^{-1}(b) = \frac{1}{2\pi i} \int_\Gamma f(\lambda)\,\tau^{-1}(R(\lambda; a))\,d\lambda$

(the integral converges in $C(\sigma(a))$), so for every $\mu \in \sigma(a)$,

$(\tau^{-1}(b))(\mu) = \frac{1}{2\pi i} \int_\Gamma f(\lambda)\,[\tau^{-1}(R(\lambda; a))](\mu)\,d\lambda = \frac{1}{2\pi i} \int_\Gamma \frac{f(\lambda)}{\lambda - \mu}\,d\lambda = f(\mu)$

by Cauchy's theorem. Thus $b = \tau(f)$, as desired.


We end this section with the following surprising and fundamental result, 
which says that the algebraic properties of being a *-homomorphism between 
two C*-algebras and, respectively, being an injective one, imply the analytic 
properties of being contractive and, respectively, isometric. Such results are 
particularly satisfying. We later improve this result further in Theorem 11.14. 


Theorem 11.2. Let A, B be C*-algebras and $\varphi : A \to B$ be a
*-homomorphism.


(i) $\varphi$ is contractive ($\|\varphi\| \le 1$); in particular, $\varphi$ is continuous.


(ii) If $\varphi$ is injective, then it is isometric. Consequently, $\varphi(A)$ is a
C*-subalgebra of B.


Proof. We let the reader explain why we may assume that A, B are unital and
so is $\varphi$. This implies that for every $a \in A$, $\sigma_B(\varphi(a)) \subset \sigma_A(a)$, and if $\varphi$ is an
isomorphism from A onto B, then $\sigma_B(\varphi(a)) = \sigma_A(a)$.

(i) Let $a \in A$. By normality of $a^*a$ and $\varphi(a^*a)$, Lemma 7.15 implies that
$\|a\|^2 = \|a^*a\| = r_A(a^*a)$ and $\|\varphi(a)\|^2 = \|\varphi(a)^*\varphi(a)\| = \|\varphi(a^*a)\| = r_B(\varphi(a^*a))$.
Since $\sigma_B(\varphi(a^*a)) \subset \sigma_A(a^*a)$, we obtain $r_B(\varphi(a^*a)) \le r_A(a^*a)$, so $\|\varphi(a)\| \le \|a\|$.

Before proceeding, notice that if $c \in A$ is normal and $f \in C(\sigma(c))$,
then $\varphi(f(c)) = f(\varphi(c))$. Indeed, this holds for $f$ being a polynomial
$f(\lambda) = \sum \alpha_{kj}\lambda^k\bar\lambda^j$ because $\varphi$ is a *-homomorphism, and then for every
$f \in C(\sigma(c))$ by approximation and the continuity of $\varphi$.

(ii) Assume that $\varphi$ is injective. Let $c \in A$ be normal. We already know
that $\sigma_B(\varphi(c)) \subset \sigma_A(c)$. If strict inclusion occurs, then by Urysohn's lemma
(Theorem 3.1) there is a non-zero $f \in C(\sigma_A(c))$ vanishing on $\sigma_B(\varphi(c))$. Thus,

$f(c) \ne 0$ (by injectivity of the operational calculus)
but $\varphi(f(c)) = f(\varphi(c)) = 0$,

in contrast to $\varphi$ being assumed injective. Thus $\sigma_B(\varphi(c)) = \sigma_A(c)$ whenever $c \in A$
is normal.

Finally, for $a \in A$ we have $\sigma_B(\varphi(a^*a)) = \sigma_A(a^*a)$, so $r_B(\varphi(a^*a)) = r_A(a^*a)$
and $\|\varphi(a)\| = \|a\|$ as in the proof of (i).


11.3. Positive elements 


Positivity plays an essential role in C*-algebra theory. We introduce two forms 
of positivity: of elements of C*-algebras in this section and of functionals on 
C*-algebras in Section 11.6. Let A be a C*-algebra. 

An element $x \in A$ is called positive if it is selfadjoint and $\sigma(x) \subset \mathbb{R}^+ :=
[0,\infty)$. Denote by $A_+$ the set of all positive elements of A.

If B is a C*-subalgebra of $A^\#$ containing $x$ and 1, then $x$ is positive in A iff
it is positive in B by Theorem 7.18. We later strengthen this assertion.

Since the operational calculus for a normal element $x$ is a *-isomorphism of
$C(\sigma(x))$ and $[x]$ (= the unital C*-subalgebra of $A^\#$ generated by $x$ and 1), the
element $f(x)$ is selfadjoint iff $f$ is real on $\sigma(x)$, and by the spectral mapping
theorem (Theorem 7.20), it is positive iff $f(\sigma(x)) \subset \mathbb{R}^+$, that is, iff $f \ge 0$ on
$\sigma(x)$. In particular, for $x$ selfadjoint, decompose the real function $f_1(\lambda) = \lambda$
($\lambda \in \sigma(x) \subset \mathbb{R}$) as $f_1 = f_1^+ - f_1^-$, so that $x = x^+ - x^-$, where $x^+ := f_1^+(x)$ and
$x^- := f_1^-(x)$ are both positive elements of A (since $f_1^+, f_1^- \ge 0$ on $\sigma(x)$) and
$x^+x^- = x^-x^+ = 0$ (since $f_1^+f_1^- = 0$ on $\sigma(x)$). We call $x^+$ and $x^-$ the positive
part and the negative part of $x$ respectively.

Since every element of A can be written as x +iy with x,y € Asa, we deduce 
that every element of A can be written as the linear combination of at most 4 
elements of A,. 

If « € Ay,o(x) C [0,||z||] and ||z|| € o(x) (cf. the last paragraph of 
Section 7.3). 

For a real scalar a > ||x||, a1 — x is selfadjoint, and by Theorem 7.20, 


a(al — x2) =a—o(x) Ca — (0,||a]|] = [a — ||a||, a] C [0, a]. 
Therefore, by Lemma 7.15, 


jal —2||=r(al—az)<a (1) 


328 11. C*-algebras 


(the norm is in A¥). Conversely, if x is selfadjoint and (1) is satisfied, then 
a—o(x) =a(al — 2x) C [-a,al, 


hence, 
a(x) C a+ [—a, a] = [0, 2a], 


and therefore x € A;. This proves the following 


Lemma 11.3. Let $x \in A$ be selfadjoint and fix $\alpha \ge \|x\|$. Then $x$ is positive iff
$\|\alpha 1 - x\| \le \alpha$.
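
As a hedged numerical illustration of Lemma 11.3 (not part of the original text), the following Python sketch works in the algebra of real 2x2 matrices and compares the norm criterion of the lemma with a direct spectral test on symmetric matrices. The helper names and the two sample matrices are assumptions chosen only for illustration.

```python
import math

def eigenvalues_sym2(m):
    """Eigenvalues (ascending) of a real symmetric 2x2 matrix ((p, q), (q, r))."""
    (p, q), (_, r) = m
    mean = (p + r) / 2.0
    radius = math.hypot((p - r) / 2.0, q)
    return mean - radius, mean + radius

def norm_sym2(m):
    """Operator norm of a selfadjoint matrix, equal to its spectral radius."""
    lo, hi = eigenvalues_sym2(m)
    return max(abs(lo), abs(hi))

def is_positive_by_lemma(m, tol=1e-12):
    """Positivity test of Lemma 11.3 with alpha := ||x||."""
    alpha = norm_sym2(m)
    (p, q), (_, r) = m
    shifted = ((alpha - p, -q), (-q, alpha - r))  # the matrix alpha*1 - x
    return norm_sym2(shifted) <= alpha + tol

def is_positive_by_spectrum(m, tol=1e-12):
    """Direct test: the smallest eigenvalue is non-negative."""
    return eigenvalues_sym2(m)[0] >= -tol

x_pos = ((2.0, 1.0), (1.0, 1.0))  # both eigenvalues are positive
x_neg = ((1.0, 2.0), (2.0, 1.0))  # eigenvalues -1 and 3

print(is_positive_by_lemma(x_pos), is_positive_by_spectrum(x_pos))
print(is_positive_by_lemma(x_neg), is_positive_by_spectrum(x_neg))
```

On these samples the two tests agree, as the lemma predicts; the tolerance `tol` only absorbs floating-point rounding.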


Theorem 11.4. Let $A$ be a C*-algebra. $A_+$ is a closed positive cone in $A$ (i.e., a
closed subset of $A$, closed under addition and multiplication by non-negative
scalars, such that $A_+ \cap (-A_+) = \{0\}$).


Proof. We may assume without loss of generality that $A$ is unital. Let
$\{x_n\}_{n=1}^\infty$ in $A_+$ converge to $x \in A$. Then $x$ is selfadjoint (because $x_n$ are
selfadjoint and the involution is continuous). By Lemma 11.3 with
$\alpha_n := \|x_n\|$ ($\to \|x\| =: \alpha$),

\[ \|\alpha 1 - x\| = \lim_n \|\alpha_n 1 - x_n\| \le \lim_n \alpha_n = \alpha, \]

hence $x \in A_+$ (by the same lemma).
Let $x_n \in A_+$, $\alpha_n := \|x_n\|$, $n = 1, 2$, $x := x_1 + x_2$, and $\alpha := \alpha_1 + \alpha_2$ ($\ge \|x\|$).
Again by Lemma 11.3,

\[ \|\alpha 1 - x\| = \|(\alpha_1 1 - x_1) + (\alpha_2 1 - x_2)\| \le \|\alpha_1 1 - x_1\| + \|\alpha_2 1 - x_2\| \le \alpha_1 + \alpha_2 = \alpha, \]

hence $x \in A_+$.
If $x \in A_+$ and $\alpha \ge 0$, then $\alpha x$ is selfadjoint and $\sigma(\alpha x) = \alpha\sigma(x) \subset \mathbb{R}^+$, so
that $\alpha x \in A_+$.
Finally, if $x \in A_+ \cap (-A_+)$, then $\sigma(x) \subset \mathbb{R}^+ \cap (-\mathbb{R}^+) = \{0\}$, so that $x$
is both selfadjoint and quasi-nilpotent, hence $x = 0$ (see the last paragraph of
Section 7.3).


Let $f : \mathbb{R}^+ \to \mathbb{R}^+$ be the positive square root function. Since it belongs to
$C(\sigma(x))$ for any $x \in A_+$, it "operates" on each element $x \in A_+$ through the
$C(\sigma(x))$-operational calculus. The element $f(x) \in A^\#$ is positive (since $f \ge 0$
on $\sigma(x)$), and $f(x)^2 = x$ (since the operational calculus is a homomorphism). It
is called the positive square root of $x$, denoted $x^{1/2}$. Note that $x^{1/2} \in [x]$, which
means that it is the limit of polynomials in $x$, $p_n(x)$, where $p_n \to f$ uniformly on
$\sigma(x)$. In fact we have $x^{1/2} \in A$, because if $A$ is not unital, then since $f(0) = 0$
we may choose the polynomials to satisfy $p_n(0) = 0$ for every $n$. Suppose also
$y \in A_+$ satisfies $y^2 = x$. The polynomials $q_n(\lambda) = p_n(\lambda^2)$ converge uniformly
to $f(\lambda^2) = \lambda$ on $\sigma(y)$ (since $\lambda^2 \in \sigma(y)^2 = \sigma(x)$ when $\lambda \in \sigma(y)$). Therefore
(by continuity of the operational calculus) $q_n(y) \to y$. But $q_n(y) = p_n(y^2) =
p_n(x) \to x^{1/2}$. Hence, $y = x^{1/2}$, which means that the positive square root is
unique. Note that conversely, if $x \in A$ equals $y^2$ for a selfadjoint $y \in A$, then $x$
is positive. We show two applications of this in the next proposition.


Proposition 11.5. Let $A$ be a C*-algebra.


(i) The decomposition $x = x^+ - x^-$ of a selfadjoint element $x \in A$, with
$x^+, x^- \in A_+$ and $x^+x^- = 0$, is unique.

(ii) If $B$ is a C*-subalgebra of $A$, then $B_+ = B \cap A_+$. That is, positivity is
not affected by the C*-algebra in question.


Proof. (i) If $u, v \in A_+$, $u - v = x$ and $uv = 0 = vu$, then $(u+v)^2 = u^2 + v^2 =
(u-v)^2 = x^2$. Since $x^2, u+v \in A_+$ (see Theorem 11.4), we have $u + v = (x^2)^{1/2}$.
Since $u - v = x$, we get $u = \frac12\bigl(x + (x^2)^{1/2}\bigr)$ and $v = \frac12\bigl((x^2)^{1/2} - x\bigr)$.

(ii) If $x \in B_+$, it equals $y^2$ for a selfadjoint $y \in B \subset A$, so $x \in A_+$. Conversely,
if $x \in B \cap A_+$, then $x^{1/2}$ belongs to the Banach algebra generated by $x$ (in $A$,
without the unit $1_{A^\#}$ of $A^\#$), which is contained in $B$. Thus $x = (x^{1/2})^2$ is in $B_+$.

(An alternative, straightforward proof of (ii) is via Exercise 26 of Chapter 7.)
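
The formulas $x^+ = \frac12((x^2)^{1/2} + x)$ and $x^- = \frac12((x^2)^{1/2} - x)$ from the proof of (i) can be tried out numerically. The following Python sketch is an illustration only, under the assumption that the algebra is the real 2x2 matrices; the closed form used for the square root is the standard one for positive semidefinite 2x2 matrices, and all helper names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def sqrt_psd2(m):
    """Positive square root of a positive semidefinite symmetric 2x2 matrix.

    Uses (m + s*1) / t with s = sqrt(det m) and t = sqrt(tr m + 2*s),
    which follows from the Cayley-Hamilton identity for the square root.
    """
    (p, q), (_, r) = m
    s = math.sqrt(max(p * r - q * q, 0.0))
    t = math.sqrt(p + r + 2.0 * s)
    if t == 0.0:  # m is the zero matrix
        return ((0.0, 0.0), (0.0, 0.0))
    return (((p + s) / t, q / t), (q / t, (r + s) / t))

def parts(x):
    """Return (x_plus, x_minus) for a symmetric 2x2 matrix x."""
    abs_x = sqrt_psd2(mat_mul(x, x))  # (x^2)^{1/2}
    plus = tuple(tuple((abs_x[i][j] + x[i][j]) / 2.0 for j in range(2)) for i in range(2))
    minus = tuple(tuple((abs_x[i][j] - x[i][j]) / 2.0 for j in range(2)) for i in range(2))
    return plus, minus

x = ((0.0, 1.0), (1.0, 0.0))          # selfadjoint, eigenvalues -1 and 1
x_plus, x_minus = parts(x)
product = mat_mul(x_plus, x_minus)    # should be the zero matrix

print(x_plus)
print(x_minus)
print(product)
```

For this sample, `x_plus` and `x_minus` are the two rank-one projections onto the eigenvectors of `x`, and their product vanishes, in line with the uniqueness statement.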


The representation $x = y^2$ (with $y \in A_+$) of a positive element $x$ shows in
particular that $x = y^*y$ (since $y$ is selfadjoint). This last property characterizes
positive elements by Part (i) of Theorem 11.6, proved by Kaplansky based on
results of Fukamiya and of Kelley and Vaught:


Theorem 11.6. Let $A$ be a C*-algebra.
(i) The element $x \in A$ is positive iff $x = y^*y$ for some $y \in A$.
(ii) If $x$ is positive, then $z^*xz$ is positive for all $z \in A$.
(iii) If $A$ is a C*-subalgebra of $B(X)$ for some Hilbert space $X$, then $T \in A$
is positive iff $(Tx, x) \ge 0$ for all $x \in X$.


Before we give the proof, recall that in every Banach algebra $A$,

\[ \sigma(xy) \cup \{0\} = \sigma(yx) \cup \{0\} \qquad (\forall x, y \in A) \tag{2} \]

(see the remarks following Definition 7.5).


Proof of Theorem 11.6.

(i) The preceding remarks show that we only need to prove that $x := y^*y$
is positive for any $y \in A$. Since it is trivially selfadjoint, we decompose it as
$x = x^+ - x^-$, and we only need to show that $x^- = 0$. Let $z = yx^-$. Then since
$x^+x^- = 0$,

\[ z^*z = x^- y^*y\, x^- = x^- x\, x^- = x^-(x^+ - x^-)x^- = -(x^-)^3. \tag{3} \]

But $(x^-)^3$ is positive; therefore

\[ -z^*z \in A_+. \tag{4} \]

Write $z = a + ib$ with $a, b$ selfadjoint elements of $A$. Then $a^2, b^2 \in A_+$, and
therefore, by Theorem 11.4,

\[ z^*z + zz^* = 2a^2 + 2b^2 \in A_+. \tag{5} \]

By (2) and (4),

\[ \sigma(-zz^*) \subset \sigma(-z^*z) \cup \{0\} \subset \mathbb{R}^+. \]

Thus, $-zz^* \in A_+$, and so by (5) (cf. Theorem 11.4)

\[ z^*z = (z^*z + zz^*) + (-zz^*) \in A_+. \]

Together with (4), this shows that $z^*z \in A_+ \cap (-A_+)$, hence $z^*z = 0$ by
Theorem 11.4, and therefore $z = 0$ because $\|z\|^2 = \|z^*z\| = 0$. By (3), we
conclude that $x^- = 0$ because $x^-$ is both selfadjoint and nilpotent, as wanted.

(ii) Write the positive element $x$ in the form $x = y^*y$ with $y \in A$ (by (i)).
Then $z^*xz = z^*(y^*y)z = (yz)^*(yz) \in A_+$, again by (i).

(iii) If $T$ is positive, write $T = S^*S$ for some $S \in A$ (by (i)). Then $(Tx, x) =
(Sx, Sx) \ge 0$ for all $x \in X$.

Conversely, if $(Tx, x) \in \mathbb{R}$ for all $x \in X$, then $(T^*x, x) = (x, Tx) = \overline{(Tx, x)} =
(Tx, x)$ for all $x$, and by polarization (cf. identity (11) following Definition 1.34)
$(T^*x, y) = (Tx, y)$ for all $x, y \in X$, hence $T^* = T$.

For any $\delta > 0$, we have

\[ \|(-\delta I - T)x\|^2 = \|\delta x + Tx\|^2 = \delta^2\|x\|^2 + 2\Re[\delta(x, Tx)] + \|Tx\|^2 \ge \delta^2\|x\|^2, \]

because $(x, Tx) \ge 0$. Therefore

\[ \|(-\delta I - T)x\| \ge \delta\|x\| \qquad (\forall x \in X). \tag{6} \]

This implies that $T_\delta := -\delta I - T$ is injective (trivially) and has closed range $=: Y$
(indeed, if $T_\delta x_n \to y$, then

\[ \|x_n - x_m\| \le \delta^{-1}\|T_\delta(x_n - x_m)\| \to 0; \]

hence $\exists \lim x_n =: x$, and $y = \lim_n T_\delta x_n = T_\delta x \in Y$).
If $z \in Y^\perp$, then for all $x \in X$, since $T_\delta$ is selfadjoint,

\[ (x, T_\delta z) = (T_\delta x, z) = 0. \]

Hence, $T_\delta z = 0$, and therefore $z = 0$ since $T_\delta$ is injective. Consequently, $Y = X$,
by Theorem 1.36. This shows that $T_\delta$ is bijective. We have $\|T_\delta^{-1}\| \le \delta^{-1}$ by (6),
and therefore $T_\delta^{-1} \in B(X)$. Thus, $-\delta \in \rho_{B(X)}(T)$. Also, since $T$ is selfadjoint,
$\sigma_{B(X)}(T) \subset \mathbb{R}$ by Theorem 7.17. Therefore $\sigma_{B(X)}(T) \subset \mathbb{R}^+$ and hence
$T \in B(X)_+$. From Proposition 11.5 we conclude that $T \in A_+$.


For $x \in A_+$ and $\alpha > 0$, define $x^\alpha := f_\alpha(x)$ for $f_\alpha \in C(\sigma(x))$ given by
$f_\alpha(\lambda) := \lambda^\alpha$, $\lambda \in \sigma(x)$. Then $x^\alpha$ is also positive, and it belongs to $A$ (and not
"just" to $A^\#$) because $0^\alpha = 0$. When $A$ is unital and $x \in A_+$ is invertible, we
similarly define $x^\alpha$ for each $\alpha \in \mathbb{R}$. Now $x^\alpha$ is both positive and invertible.

Give $A_{sa}$ the partial order induced by the positive cone $A_+$ (Theorem 11.4):
for $a, b \in A_{sa}$, $a \ge b$ iff $a - b \in A_+$. If $a, b \in A_{sa}$ and $x \in A$, then
$x^*ax, x^*bx \in A_{sa}$, and $a \ge b$ implies $x^*ax \ge x^*bx$ by Theorem 11.6.

We will use repeatedly the following observations. If $a \in A_{sa}$ and $m \in \mathbb{R}$, then
$a \le m1$ (resp., $m1 \le a$) in $A^\#$ iff $\max\sigma(a) \le m$ (resp., $m \le \min\sigma(a)$). Indeed,
$m1 - a = (m - f_1)(a)$, so the former is positive iff the latter is positive on $\sigma(a)$,
which means that $\max\sigma(a) \le m$. Therefore, for $m \ge 0$, we have $-m1 \le a \le m1$
iff $\|a\| \le m$, and if $a \in A_+$, then $a \le m1$ iff $\|a\| \le m$.


Theorem 11.7. Let $A$ be a C*-algebra and $a, b \in A_{sa}$.


(i) If $-b \le a \le b$, then $\|a\| \le \|b\|$.
(ii) If $0 \le a \le b$, then $0 \le a^{1/2} \le b^{1/2}$.
(iii) If $0 \le a \le b$, $A$ is unital and $a$ is invertible, then $b$ is also invertible and
$0 \le b^{-1} \le a^{-1}$.


Proof. We may assume that $A$ is unital.
(i) From the observations preceding the theorem's statement we have

\[ -\|b\|1 \le -b \le a \le b \le \|b\|1 \implies \sigma(a) \subset [-\|b\|, \|b\|] \implies \|a\| \le \|b\|. \]

(ii) and (iii): Suppose that $0 \le a \le b$ and $a$ is invertible. For $\lambda_0 := \min\sigma(a) > 0$
we have $\lambda_0 1 \le a \le b$, so $\lambda_0 \le \min\sigma(b)$, thus $b$ is also invertible. Additionally,

\[ 0 \le b^{-1/2}ab^{-1/2} \le b^{-1/2}bb^{-1/2} = 1. \]

Hence, by the C*-identity,

\[ 1 \ge \|b^{-1/2}ab^{-1/2}\| = \|(a^{1/2}b^{-1/2})^*(a^{1/2}b^{-1/2})\| = \|a^{1/2}b^{-1/2}\|^2
   = \|(a^{1/2}b^{-1/2})(a^{1/2}b^{-1/2})^*\| = \|a^{1/2}b^{-1}a^{1/2}\|. \]

Since $a^{1/2}b^{-1}a^{1/2} \in A_{sa}$, we get $a^{1/2}b^{-1}a^{1/2} \le 1$. Therefore

\[ b^{-1} = a^{-1/2}\bigl(a^{1/2}b^{-1}a^{1/2}\bigr)a^{-1/2} \le a^{-1/2}a^{-1/2} = a^{-1}, \]

proving (iii). Furthermore, since $b^{-1/4}a^{1/2}b^{-1/4} \in A_{sa}$, from (2) we obtain that

\[ \|b^{-1/4}a^{1/2}b^{-1/4}\| = r(b^{-1/4}a^{1/2}b^{-1/4}) = r(a^{1/2}b^{-1/4}b^{-1/4})
   = r(a^{1/2}b^{-1/2}) \le \|a^{1/2}b^{-1/2}\| \le 1; \]

thus $b^{-1/4}a^{1/2}b^{-1/4} \le 1$, so

\[ a^{1/2} = b^{1/4}\bigl(b^{-1/4}a^{1/2}b^{-1/4}\bigr)b^{1/4} \le b^{1/4}b^{1/4} = b^{1/2}, \]

proving (ii).

In the general case when $a$ is not necessarily invertible, for every $r > 0$ we
have $0 \le a + r1 \le b + r1$ and $a + r1$ is invertible, thus $(a + r1)^{1/2} \le (b + r1)^{1/2}$.
Letting $r \to 0^+$ we conclude that $a^{1/2} \le b^{1/2}$ by the continuity of the continuous
operational calculus.
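
Part (ii) of Theorem 11.7 can be checked on a small example. The Python sketch below is only an illustration, assuming real symmetric 2x2 matrices with hand-picked entries; the helper names are hypothetical. It also evaluates $b^2 - a^2$ for the same pair, a well-known reminder that squaring, unlike the square root, need not preserve the order.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def sub(a, b):
    """Entrywise difference a - b."""
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))

def is_psd(m, tol=1e-12):
    """A symmetric 2x2 matrix is positive iff both diagonal entries and the determinant are >= 0."""
    (p, q), (_, r) = m
    return p >= -tol and r >= -tol and p * r - q * q >= -tol

def sqrt_psd2(m):
    """Positive square root of a positive semidefinite symmetric 2x2 matrix."""
    (p, q), (_, r) = m
    s = math.sqrt(max(p * r - q * q, 0.0))
    t = math.sqrt(p + r + 2.0 * s)
    if t == 0.0:
        return ((0.0, 0.0), (0.0, 0.0))
    return (((p + s) / t, q / t), (q / t, (r + s) / t))

a = ((1.0, 1.0), (1.0, 1.0))
b = ((2.0, 1.0), (1.0, 1.0))

order_holds = is_psd(a) and is_psd(sub(b, a))                   # 0 <= a <= b
roots_ordered = is_psd(sub(sqrt_psd2(b), sqrt_psd2(a)))         # a^{1/2} <= b^{1/2}
squares_ordered = is_psd(sub(mat_mul(b, b), mat_mul(a, a)))     # is a^2 <= b^2 ?

print(order_holds, roots_ordered, squares_ordered)
```

Here $b^2 - a^2$ has determinant $-1$, so it is not positive, while the difference of the square roots is.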


We end this section with the observation that a *-homomorphism from a
C*-algebra $A$ to a C*-algebra $B$ preserves positivity: it maps $A_+$ into $B_+$ (see
Exercise 2).


11.4 Approximate identities 


A bounded approximate identity for a Banach algebra $A$ is a bounded net $\{e_\alpha\}$
of elements of $A$ such that for every $a \in A$, $\lim_\alpha e_\alpha a = a$ and $\lim_\alpha ae_\alpha = a$.
One-sided approximate identities are defined analogously. If $A$ is a C*-algebra,
then we require, in addition, that $e_\alpha$ be positive and of norm at most 1 for every
$\alpha$, and that $\{e_\alpha\}$ be increasing. (In this case, $\lim_\alpha e_\alpha a = a$ for every $a \in A$ iff
$\lim_\alpha ae_\alpha = a$ for every $a \in A$, as $(a^*e_\alpha)^* = e_\alpha a$.)

In this section we prove that every C*-algebra has an approximate identity. 


Theorem 11.8. Let $A$ be a C*-algebra. Set $\Lambda := \{a \in A_+ ; \|a\| < 1\}$ and give
$\Lambda$ the order inherited from $A_{sa}$. Let $e_\alpha := \alpha$ for $\alpha \in \Lambda$. Then $\{e_\alpha\}_{\alpha\in\Lambda}$ is an
approximate identity for $A$.


Lemma 11.9. The map $T : A_+ \ni x \mapsto x(1 + x)^{-1}$ is an order-preserving
isomorphism of $A_+$ onto $\Lambda$. That is: $T$ maps into $\Lambda$, it is 1-1 and onto, and if
$x, y \in A_+$, then $x \le y \implies x(1 + x)^{-1} \le y(1 + y)^{-1}$.


Proof. Let $x \in A_+$. Note that $f \in C(\sigma(x))$ given by $\lambda \mapsto \frac{\lambda}{1+\lambda}$ has $f(0) = 0$ if
$0 \in \sigma(x)$. Thus, although the computations are in $A^\#$, we have $x(1 + x)^{-1} =
f(x) \in A$. Also $f(x) \in \Lambda$ because $f(\sigma(x))$ is a compact subset of $[0, 1)$ and the
continuous operational calculus is isometric. The map $\Lambda \ni a \mapsto a(1 - a)^{-1}$ is the
inverse of $T$ (check!), so $T$ is 1-1 and onto.

Let $x, y \in A_+$. By Theorem 11.7, $x \le y \implies 1 + x \le 1 + y$ (and both are
invertible!) $\implies (1 + y)^{-1} \le (1 + x)^{-1} \implies x(1 + x)^{-1} = 1 - (1 + x)^{-1} \le
1 - (1 + y)^{-1} = y(1 + y)^{-1}$. This proves that $T$ is order preserving.


Corollary 11.10. $\Lambda$ is upwards directed: for every $a, b \in \Lambda$ there exists $c \in \Lambda$
such that $a, b \le c$.


This follows since the partial orders $\Lambda$ and $A_+$ are isomorphic by the lemma,
and the latter is upwards directed: if $x, y \in A_+$, then $x + y \in A_+$ and $x, y \le x + y$.
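
The map $T$ of Lemma 11.9 and its inverse can be made concrete. The following Python sketch is an illustration only, assuming real symmetric 2x2 matrices and two sample elements chosen by hand; the helper names are hypothetical. It uses the identities $x(1+x)^{-1} = 1 - (1+x)^{-1}$ and $a(1-a)^{-1} = (1-a)^{-1} - 1$.

```python
import math

ONE = ((1.0, 0.0), (0.0, 1.0))

def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def sub(a, b):
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))

def inv2(m):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    (p, q), (s, r) = m
    det = p * r - q * s
    return ((r / det, -q / det), (-s / det, p / det))

def T(x):
    """x -> x(1+x)^{-1}, computed as 1 - (1+x)^{-1}."""
    return sub(ONE, inv2(add(ONE, x)))

def T_inv(a):
    """a -> a(1-a)^{-1}, computed as (1-a)^{-1} - 1."""
    return sub(inv2(sub(ONE, a)), ONE)

def norm_sym2(m):
    """Operator norm of a symmetric 2x2 matrix (largest absolute eigenvalue)."""
    (p, q), (_, r) = m
    mean, radius = (p + r) / 2.0, math.hypot((p - r) / 2.0, q)
    return max(abs(mean - radius), abs(mean + radius))

def is_psd(m, tol=1e-12):
    (p, q), (_, r) = m
    return p >= -tol and r >= -tol and p * r - q * q >= -tol

x = ((1.0, 1.0), (1.0, 1.0))   # positive, eigenvalues 0 and 2
y = ((2.0, 1.0), (1.0, 2.0))   # y - x is the identity, so x <= y

Tx, Ty = T(x), T(y)
norms = (norm_sym2(Tx), norm_sym2(Ty))            # both should be < 1
order_preserved = is_psd(sub(Ty, Tx))             # T(x) <= T(y)
roundtrip_error = max(abs(T_inv(Tx)[i][j] - x[i][j]) for i in range(2) for j in range(2))

print(norms, order_preserved, roundtrip_error)
```

For these samples the norms of $T(x)$ and $T(y)$ are $2/3$ and $3/4$, matching $\lambda \mapsto \lambda/(1+\lambda)$ applied to the largest eigenvalues 2 and 3.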


Proof of Theorem 11.8. To prove that the net $\{e_\alpha\}_{\alpha\in\Lambda}$ is an approximate
identity for $A$ we need only show that $\lim_{\alpha\in\Lambda} e_\alpha a = a$ for every $a \in A$.

If $A$ is commutative, this is simple. Indeed, by the commutative Gelfand–Naimark
theorem, we can assume that $A = C_0(\Omega)$ for a locally compact
Hausdorff space $\Omega$. Fix $f \in C_0(\Omega)$ and $\epsilon > 0$. The set $K := \{x \in \Omega ; |f(x)| \ge \epsilon\}$
is compact in $\Omega$, so by Urysohn's lemma there is $g : \Omega \to [0, 1]$ in $C_0(\Omega)$ with
$g|_K = 1$. Then for $0 < r < 1$ large enough we have $\|f - rgf\| < \epsilon$ and $rg \in \Lambda$.
In the general case, fix $a \in A$ and $0 < \epsilon < 1$. We may and do assume
that $a$ is normal. Denote by $B \subset A$ the (commutative, not necessarily unital)
C*-algebra generated by $a$. By the previous paragraph applied to $B$, there is
$\alpha_0 \in \Lambda$ such that $\|a - e_{\alpha_0}a\| < \epsilon$. If $\alpha_0 \le \alpha \in \Lambda$, then (calculating in $A^\#$)
$0 \le 1 - e_\alpha \le 1 - e_{\alpha_0} \le 1$, hence $0 \le a^*(1 - e_\alpha)a \le a^*(1 - e_{\alpha_0})a$, so

\[ \|a - e_\alpha a\|^2 = \|(1 - e_\alpha)a\|^2 = \|(1 - e_\alpha)^{1/2}(1 - e_\alpha)^{1/2}a\|^2 \le \|(1 - e_\alpha)^{1/2}a\|^2 \]
\[ = \|a^*(1 - e_\alpha)a\| \le \|a^*(1 - e_{\alpha_0})a\| \le \|a\|\,\|(1 - e_{\alpha_0})a\| \le \|a\|\epsilon. \]

Thus $\lim_{\alpha\in\Lambda} e_\alpha a = a$.
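
The commutative step of the proof can be illustrated numerically in $C_0(\mathbb{R})$. The Python sketch below is only an illustration under stated assumptions: it uses the sample function $f(x) = 1/(1+x^2)$, a hypothetical increasing sequence $e_n = \frac{n}{n+1} g_n$ with $g_n$ a trapezoidal cut-off (so $e_n \in \Lambda$), and a finite grid in place of the supremum over $\mathbb{R}$. A sequence is of course only a small part of the full net $\Lambda$.

```python
def f(x):
    """A sample element of C_0(R)."""
    return 1.0 / (1.0 + x * x)

def e(n, x):
    """e_n = (n/(n+1)) * g_n, where g_n is 1 on [-n, n] and decays linearly to 0 at |x| = n+1."""
    g = min(1.0, max(0.0, n + 1.0 - abs(x)))
    return (n / (n + 1.0)) * g

# Grid on [-50, 50] with step 0.01; it contains the point 0 exactly.
grid = [i * 0.01 for i in range(-5000, 5001)]

def sup_error(n):
    """Grid approximation of the sup norm of f - e_n * f."""
    return max(abs(f(x) - e(n, x) * f(x)) for x in grid)

errors = [sup_error(n) for n in (1, 10, 100)]
print(errors)
```

For this particular $f$ the error is attained at $x = 0$ and equals $1/(n+1)$, so it decreases to 0 as $n$ grows.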


11.5 Ideals 


The two main results of this section say that closed ideals of C*-algebras, as well 
as quotients by them, are C*-algebras. Remarkably, such ideals are automatically 
selfadjoint! This is established with the aid of approximate identities. 


Lemma 11.11. If $L$ is a closed left ideal of a C*-algebra $A$, then $L$ (as a Banach
algebra) has a bounded right approximate identity $\{e_\alpha\}$, which is increasing, and
all of its elements are positive and of norm at most 1.


Proof. Set $B := L \cap L^*$. Then $B$ is a C*-subalgebra of $A$ and $L^*L \subset B$. Let
$\{e_\alpha\}_\alpha$ be an approximate identity for $B$ (see Theorem 11.8). If $a \in L$ then
$a^*a \in B$, and thus $0 = \lim_\alpha (a^*a - a^*ae_\alpha)$. Consequently,

\[ \|a - ae_\alpha\|^2 = \|(1 - e_\alpha)a^*a(1 - e_\alpha)\| \le \|a^*a(1 - e_\alpha)\| \xrightarrow[\alpha]{} 0 \]

(calculating in $A^\#$), so $ae_\alpha \to a$.


Theorem 11.12. Let $I$ be a closed ideal in a C*-algebra $A$. Then $I$ is selfadjoint
(i.e., closed under the involution operation), hence it is a C*-subalgebra of $A$.


Proof. Let $\{e_\alpha\}_\alpha$ be as in the last lemma. Then for every $a \in I$, we have
$a = \lim_\alpha ae_\alpha$, so $a^* = \lim_\alpha e_\alpha a^* \in I$ because $e_\alpha \in I$ for all $\alpha$ and $I$ is a closed
right ideal.


Convention. Henceforth, unless otherwise stated, an ideal of a C*-algebra will 
always be (two sided and) closed. 


Theorem 11.13. Let $I$ be an ideal of a C*-algebra $A$. Put $(a + I)^* := a^* + I$
($a \in A$). Then this operation is an involution on $A/I$, and $A/I$ is a C*-algebra.


Proof. We already know that $A/I$ is a Banach algebra; see Section 7.1. The
function $a + I \mapsto a^* + I$ ($a \in A$) on $A/I$ is well defined by the selfadjointness of
$I$ (Theorem 11.12). It is then easily seen to be an involution on $A/I$.

It remains to prove the C*-identity. Let $\{e_\alpha\}_\alpha$ be an approximate identity
for $I$ (cf. Theorems 11.12 and 11.8).

Claim: for all $a \in A$ we have $\|a + I\| = \lim_\alpha \|a - ae_\alpha\|$. Indeed, for all $\alpha$, we
have $\|a + I\| \le \|a - ae_\alpha\|$ as $e_\alpha \in I$ and hence $ae_\alpha \in I$. On the other hand, for
every $\epsilon > 0$ there is $b \in I$ with $\|a - b\| < \|a + I\| + \epsilon$. Thus

\[ \|a - ae_\alpha\| = \|(a - b)(1 - e_\alpha) + (b - be_\alpha)\|
   \le \underbrace{\|a - b\|}_{< \|a + I\| + \epsilon}\,\underbrace{\|1 - e_\alpha\|}_{\le 1} + \underbrace{\|b - be_\alpha\|}_{\to 0}, \]

so $\limsup_\alpha \|a - ae_\alpha\| \le \|a + I\| + \epsilon$. This holds for all $\epsilon > 0$, proving the claim.
Take $a \in A$. By the claim,

\[ \|a + I\|^2 = \lim_\alpha \|a - ae_\alpha\|^2 = \lim_\alpha \|(1 - e_\alpha)a^*a(1 - e_\alpha)\|
   \le \lim_\alpha \|a^*a(1 - e_\alpha)\| = \|a^*a + I\|. \]

The C*-identity follows from this by Exercise 23 of Chapter 7.


A beautiful application of this theory is the following result, which improves 
Theorem 11.2. 


Theorem 11.14. Let $A, B$ be C*-algebras and $\varphi : A \to B$ a *-homomorphism.
Then:


(i) $\varphi(A)$ is a C*-subalgebra of $B$.
(ii) If $\varphi \neq 0$ then $\|\varphi\| = 1$.


Proof. (i) The ideal $\ker\varphi$ is closed in $A$ because $\varphi$ is continuous by
Theorem 11.2. Consider the induced homomorphism $\tilde\varphi : A/\ker\varphi \to B$,
$a + \ker\varphi \mapsto \varphi(a)$. It is an injective *-homomorphism between two C*-algebras
by Theorem 11.13. Therefore, it is isometric by Theorem 11.2. Thus, its image
$\tilde\varphi(A/\ker\varphi) = \varphi(A)$ is a C*-subalgebra of $B$.

(ii) We already established that $\|\varphi\| \le 1$ in Theorem 11.2. Let $\{e_\alpha\}_\alpha$ be an
approximate identity for $A$. Pick $a \in A$ such that $\varphi(a) \neq 0$. Then as $ae_\alpha \to a$,
we have $\varphi(a)\varphi(e_\alpha) = \varphi(ae_\alpha) \to \varphi(a)$. Thus

\[ \|\varphi(a)\| \leftarrow \|\varphi(a)\varphi(e_\alpha)\| \le \|\varphi(a)\|\,\|\varphi\|\,\|e_\alpha\| \le \|\varphi(a)\|\,\|\varphi\|. \]

Hence $\|\varphi\| \ge 1$.


We can now prove an analogue of one of Noether’s “isomorphism theorems”: 


Corollary 11.15. Let $A$ be a C*-algebra, $B$ be a C*-subalgebra of $A$, and $I$
an ideal of $A$. Then $B + I$ is a C*-subalgebra of $A$, and there is a natural
*-isomorphism from $(B + I)/I$ onto $B/(B \cap I)$.


Proof. Let $q : A \to A/I$ be the quotient mapping. Its restriction $q|_B : B \to A/I$
has closed image by Theorems 11.13 and 11.14. Thus the *-subalgebra $B + I =
q^{-1}(q(B))$ of $A$ is closed, and hence it is a C*-subalgebra of $A$. Since $q|_B$ is a
*-homomorphism from $B$ onto $(B + I)/I$ whose kernel is $B \cap I$, it induces the
desired *-isomorphism.


Example 11.16. Take $A := C_0(\Omega)$, where $\Omega$ is a locally compact Hausdorff
space. We know that $\Phi(A) = \{\phi_t ; t \in \Omega\}$, where $\phi_t$ is the "evaluation at $t$"
character given by $f \mapsto f(t)$ (Exercise 17 in Chapter 7). We now use this and
Theorem 11.13 to characterize all (closed) ideals of $A$.

If $S \subset \Omega$ is closed, then $I(S) := \{f \in A ; f|_S = 0\}$ is an ideal of $A$. Conversely,
all ideals of $A$ have this form. Indeed, if $I$ is an ideal of $A$, write $S(I) :=
\{t \in \Omega ; f(t) = 0 \text{ for all } f \in I\}$. Then $S(I)$ is closed and evidently $I \subset I(S(I))$. The
quotient $A/I$ is a commutative C*-algebra. Let $q : A \to A/I$ be the quotient
map. Then $\{\psi \circ q ; \psi \in \Phi(A/I)\} = \{\phi \in \Phi(A) ; \phi(I) = \{0\}\} = \{\phi_t ; t \in S(I)\}$.
Therefore, for $f \in A$, we have $f \in I \iff q(f) = 0 \iff \psi(q(f)) = 0$ for
every $\psi \in \Phi(A/I) \iff f(t) = 0$ for all $t \in S(I) \iff f \in I(S(I))$. Hence
$I = I(S(I))$.

The map $I \mapsto S(I)$ is an order-reversing isomorphism between the poset of all
ideals of $A$ and the poset of all closed subsets of $\Omega$, whose inverse is $S \mapsto I(S)$.
Indeed, this holds true because the latter map is injective by Urysohn's lemma.


Example 11.17. Let $X$ be a Hilbert space. For $x, y \in X$, denote by $\theta_{x,y}$ the
element $(\cdot, y)x$ of $K(X)$. Then $\mathrm{span}\{\theta_{x,y} ; x, y \in X\}$, which is the set of finite rank
operators in $B(X)$, is dense in $K(X)$ (see Part (d) of Exercise 8 in Chapter 6).

We claim that every non-zero ideal $I$ in $B(X)$ contains $K(X)$. Fix $0 \neq T \in I$, and let
$w \in X$ with $Tw \neq 0$. For every $x, y \in X$, we have

\[ \theta_{x,y} = \frac{1}{\|Tw\|^2}\, \theta_{x,Tw} \circ T \circ \theta_{w,y} \in I. \]

From the foregoing we conclude that $K(X) \subset I$. In particular, $K(X)$ is a simple
C*-algebra. If $X$ is an infinite-dimensional separable Hilbert space, then one can
prove that the only non-trivial ideal in $B(X)$ is $K(X)$. As a result, the Calkin
algebra $B(X)/K(X)$ is a simple C*-algebra.
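
The rank-one identity used in Example 11.17 is easy to verify by hand or by machine. The Python sketch below is an illustration only, assuming the real Hilbert space $X = \mathbb{R}^2$ (so that $\theta_{x,y}$ is the matrix with entries $x_i y_j$) and arbitrarily chosen $T$, $w$, $x$, $y$; all names are hypothetical.

```python
def theta(x, y):
    """Matrix of the rank-one operator v -> (v, y) x on R^2."""
    return [[xi * yj for yj in y] for xi in x]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_vec(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

T = [[0.0, 1.0], [0.0, 0.0]]   # a non-zero operator
w = [0.0, 1.0]                 # chosen so that T w != 0
x = [1.0, 2.0]
y = [3.0, -1.0]

Tw = mat_vec(T, w)
norm_sq = sum(c * c for c in Tw)   # ||Tw||^2

lhs = theta(x, y)
composite = mat_mul(mat_mul(theta(x, Tw), T), theta(w, y))
rhs = [[entry / norm_sq for entry in row] for row in composite]

print(lhs)
print(rhs)
```

Both printed matrices coincide, which is the identity $\theta_{x,y} = \|Tw\|^{-2}\,\theta_{x,Tw} T \theta_{w,y}$ in this special case.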


11.6 Positive linear functionals 


Let $A$ be a C*-algebra. A linear functional on $A$ is called Hermitian (positive) if
it is real-valued on $A_{sa}$ (non-negative-valued on $A_+$, respectively).

Clearly, the linear functional $\phi$ is Hermitian iff $\phi(x^*) = \overline{\phi(x)}$ for all $x \in A$
(this relation evidently implies that $\phi(x)$ is real for $x$ selfadjoint; on the other
hand, if $\phi$ is Hermitian, write $x = a + ib$ with $a, b \in A_{sa}$; then $x^* = a - ib$,
and therefore $\phi(x^*) = \phi(a) - i\phi(b)$ is the conjugate of $\phi(x) = \phi(a) + i\phi(b)$, since
$\phi(a), \phi(b) \in \mathbb{R}$). In other words, $\phi$ is Hermitian iff it equals the linear functional
$\phi^*$ on $A$ given by $x \mapsto \overline{\phi(x^*)}$. In this case we have $\Re(\phi(x)) = \phi(\Re x)$ for every
$x \in A$.

Every linear functional $\phi$ on $A$ can be expressed uniquely as $\phi = \phi_1 + i\phi_2$,
where $\phi_1, \phi_2$ are Hermitian linear functionals on $A$. Indeed, $\phi_1 := \frac12(\phi + \phi^*)$ and
$\phi_2 := \frac{1}{2i}(\phi - \phi^*)$ do the job, and uniqueness is easy.

If $\phi$ is positive, it is necessarily Hermitian (write any selfadjoint $x$ as $x^+ -
x^-$; since $\phi(x^+), \phi(x^-) \in \mathbb{R}^+$, we have $\phi(x) = \phi(x^+) - \phi(x^-) \in \mathbb{R}$). It is also
monotone on $A_{sa}$: if $a, b \in A_{sa}$ and $a \le b$, then $\phi(a) \le \phi(b)$.

Examples. 


1. If $\Omega$ is a locally compact Hausdorff space and $\mu$ is a finite positive Borel
measure on $\Omega$, then $\omega(f) := \int_\Omega f\,d\mu$ defines a positive linear functional on
$C_0(\Omega)$. By the Riesz–Markov representation theorem (3.18) or the Riesz
representation theorem (4.9) and their proofs, all positive linear functionals
on $C_0(\Omega)$ have this form (and $\mu$ may be assumed regular).

2. For $n \in \mathbb{N}$ fixed, the trace $\mathrm{tr} : M_n \to \mathbb{C}$ (given by $A = (a_{ij}) \mapsto \sum_{i=1}^n a_{ii}$) is
positive, because if $A = B^*B$ for $B \in M_n$, then $\mathrm{tr}(A) = \sum_{i,j=1}^n \overline{b_{ji}}\,b_{ji} \ge 0$. More
generally, if $H \in (M_n)_+$, then $\mathrm{tr}_H : M_n \to \mathbb{C}$ given by $\mathrm{tr}_H(A) := \mathrm{tr}(HA)$
is positive. All positive linear functionals on $M_n$ are of this form.

3. If $X$ is a Hilbert space and $A$ is a C*-subalgebra of $B(X)$, then for each
$x \in X$, the vector functional on $A$ corresponding to $x$ is $\omega_x : A \to \mathbb{C}$ given
by $A \ni T \mapsto (Tx, x)$. It is positive by Theorem 11.6.

4. If $A$ is a C*-algebra we define $\Phi(A)$ as we did in the commutative case:
it is the set of all characters of $A$, namely, the non-zero homomorphisms
from $A$ to $\mathbb{C}$. (When $A$ is not commutative, it is possible that $\Phi(A)$ be
empty!) The proof of Lemma 7.15 shows that each character is, in fact, a
*-homomorphism from $A$ to $\mathbb{C}$. Thus, every character $\phi \in \Phi(A)$ is positive,
because $\phi(a^*a) = |\phi(a)|^2 \ge 0$ for all $a \in A$.
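
Example 2 can be probed numerically. The Python sketch below is an illustration only, restricted for simplicity to real 2x2 matrices (so that $B^* = B^T$), with a hand-picked positive $H$ and pseudo-random test matrices; the helper names are hypothetical. It also shows that positivity of $H$ matters: for a selfadjoint but non-positive matrix the functional takes a negative value on a positive element.

```python
import random

def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def transpose(a):
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))

def tr_h(h, a):
    """The functional tr_H(A) = tr(HA)."""
    ha = mat_mul(h, a)
    return ha[0][0] + ha[1][1]

H = ((2.0, 1.0), (1.0, 1.0))       # positive: trace 3 > 0 and determinant 1 > 0

rng = random.Random(1)             # fixed seed, for reproducibility
values = []
for _ in range(1000):
    b = tuple(tuple(rng.uniform(-1.0, 1.0) for _ in range(2)) for _ in range(2))
    values.append(tr_h(H, mat_mul(transpose(b), b)))   # tr_H(B^T B)
all_nonneg = min(values) >= -1e-12

H_bad = ((1.0, 2.0), (2.0, 1.0))   # selfadjoint with eigenvalues -1 and 3
P = ((0.5, -0.5), (-0.5, 0.5))     # a projection, hence a positive element
bad_value = tr_h(H_bad, P)

print(all_nonneg, bad_value)
```

The value `bad_value` equals $-1$, the eigenvalue of `H_bad` belonging to the range of the projection `P`.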


We now connect positivity of a linear functional to its norm. 


Theorem 11.18. Let $A$ be a C*-algebra and $\omega : A \to \mathbb{C}$ be a linear functional.
Then the following are equivalent:


(i) $\omega$ is positive;
(ii) $\omega$ is bounded, and for every approximate identity $\{e_\alpha\}$ of $A$ we have
$\|\omega\| = \lim_\alpha \omega(e_\alpha)$; and
(iii) $\omega$ is bounded, and for some approximate identity $\{e_\alpha\}$ of $A$ we have
$\|\omega\| = \lim_\alpha \omega(e_\alpha)$.
In particular, if $A$ is unital, then $\omega$ is positive iff it is bounded and $\omega(1) = \|\omega\|$.


Before proving the general case, it is instructive to first look at the case that
$A$ is unital. Let us prove that if $\omega$ is positive, then it is bounded and $\omega(1) = \|\omega\|$.
Notice that $1 \in A_+$, hence $\omega(1) \ge 0$. If $x \in A_{sa}$, then $-\|x\|1 \le x \le \|x\|1$,
thus $-\omega(1)\|x\| \le \omega(x) \le \omega(1)\|x\|$ by positivity of $\omega$. Therefore $|\omega(x)| \le \omega(1)\|x\|$
for all $x \in A_{sa}$. Next, for $x \in A$ arbitrary, write the complex number $\omega(x)$ in its
polar form $|\omega(x)|e^{i\theta}$, $\theta \in \mathbb{R}$. Then

\[ |\omega(x)| = e^{-i\theta}\omega(x) = \omega(e^{-i\theta}x)
   = \Re\,\omega(e^{-i\theta}x) = \omega(\Re[e^{-i\theta}x]) \le \omega(1)\|\Re[e^{-i\theta}x]\| \le \omega(1)\|x\|. \]

This shows that $\omega$ is bounded and $\|\omega\| \le \omega(1)$. On the other hand, $\omega(1) \le
\|\omega\|\,\|1\| = \|\omega\|$. Therefore $\|\omega\| = \omega(1)$.

We turn to the proof of the general case. For a positive linear functional
$\omega : A \to \mathbb{C}$, put

\[ (x, y)_\omega := \omega(y^*x) \qquad (x, y \in A). \]

Since $\omega$ is positive and $x^*x \in A_+$ for every $x \in A$ by Theorem 11.6, the form
$(\cdot,\cdot)_\omega : A \times A \to \mathbb{C}$ is a semi-inner product. Its induced semi-norm on $A$ is

\[ \|x\|_\omega := (x, x)_\omega^{1/2} = \omega(x^*x)^{1/2} \qquad (x \in A). \]

By the Cauchy–Schwarz inequality for this semi-inner product,

\[ |\omega(y^*x)| = |(x, y)_\omega| \le \|x\|_\omega\|y\|_\omega = \omega(x^*x)^{1/2}\omega(y^*y)^{1/2} \qquad (x, y \in A). \tag{1} \]


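Proof.

(i) $\implies$ (ii) Let $\omega : A \to \mathbb{C}$ be a positive linear functional. We first show that
$M := \sup\{\omega(a) ; a \in A_+, \|a\| \le 1\}$ is finite. Else, there is a sequence $\{a_n\}_{n=1}^\infty$
in $A_+$ with $\|a_n\| \le 1$ and $\omega(a_n) \ge 2^n$ for every $n \in \mathbb{N}$. The series $\sum_{n=1}^\infty \frac{1}{2^n}a_n$
converges absolutely, thus converges in $A$ to an element $a$. For every $N \in \mathbb{N}$,

\[ a - \sum_{n=1}^N \frac{1}{2^n}a_n = \sum_{n=N+1}^\infty \frac{1}{2^n}a_n \ge 0, \quad\text{so}\quad
   \omega(a) \ge \omega\Bigl(\sum_{n=1}^N \frac{1}{2^n}a_n\Bigr) \ge N \]

by the positivity of $\omega$; a contradiction. Thus $M < \infty$.

For general $a \in A$, by decomposing $a = x + iy$, $x, y \in A_{sa}$, $\|x\|, \|y\| \le \|a\|$,
and further $x = x^+ - x^-$, $y = y^+ - y^-$, $x^\pm, y^\pm \in A_+$, $\|x^\pm\|, \|y^\pm\| \le \|a\|$, we
infer that $\|\omega\| \le 4M$. In particular, $\omega$ is bounded.

Assume for convenience that $\|\omega\| = 1$. Given an approximate identity $\{e_\alpha\}$
of $A$, the net $\{\omega(e_\alpha)\}$ is increasing because $\{e_\alpha\}$ is increasing by definition and
$\omega$ is positive, so it converges to its supremum $m$, which is at most $\|\omega\| = 1$. For
every $a \in A$ with $\|a\| \le 1$, we have by the Cauchy–Schwarz inequality (1)

\[ |\omega(e_\alpha a)|^2 \le \omega(e_\alpha^2)\,\omega(a^*a) \le \omega(e_\alpha) \le m \qquad (\forall\alpha). \]

Since $\lim_\alpha e_\alpha a = a$ and $\omega$ is continuous, we get $|\omega(a)|^2 \le m$. Hence $1 \le m \le 1$,
proving that $m = 1$.

(iii) $\implies$ (i) Assume that $\|\omega\| = \lim_\alpha \omega(e_\alpha) = 1$ for some approximate
identity $\{e_\alpha\}$. We first show that $\omega$ is Hermitian. Let $a \in A_{sa}$ with $\|a\| \le 1$.
Write $\omega(a) = \zeta + i\eta$ where $\zeta, \eta \in \mathbb{R}$. For all $t \in \mathbb{R}$,

\[ |\omega(a) + it\omega(e_\alpha)|^2 = |\omega(a + ite_\alpha)|^2 \le \|a + ite_\alpha\|^2 = \|(a + ite_\alpha)^*(a + ite_\alpha)\| \]
\[ = \|a^2 + t^2e_\alpha^2 + it(ae_\alpha - e_\alpha a)\| \le 1 + t^2 + |t|\cdot\|ae_\alpha - e_\alpha a\|. \]

Using that $\lim_\alpha \omega(e_\alpha) = 1$ and $\lim_\alpha (ae_\alpha - e_\alpha a) = 0$, we infer that

\[ \zeta^2 + \eta^2 + 2\eta t + t^2 = |\zeta + i\eta + it|^2 = |\omega(a) + it|^2 \le 1 + t^2. \]

Hence, $\zeta^2 + \eta^2 - 1 \le -2\eta t$, and since $t \in \mathbb{R}$ was arbitrary, we must have $\eta = 0$,
namely $\omega(a) \in \mathbb{R}$. We proved that $\omega$ is Hermitian.
Finally, let $a \in A_+$ be such that $\|a\| \le 1$. For every $\alpha$, $e_\alpha - a \in A_{sa}$ and
$-1 \le -a \le e_\alpha - a \le 1 - a \le 1$ in $A^\#$, so $\|e_\alpha - a\| \le 1$. Since, by the foregoing,
$\omega(e_\alpha - a) \in \mathbb{R}$, we get $1 - \omega(a) \leftarrow \omega(e_\alpha) - \omega(a) = \omega(e_\alpha - a) \le 1$, so $\omega(a) \ge 0$.

This completes the proof.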


We write $A^*_+$ for the set of all (bounded) positive linear functionals on $A$.
Evidently, $A^*_+$ is a weak*-closed positive cone in $A^*$. We give the set of Hermitian
functionals in $A^*$ the partial order induced by this positive cone: for Hermitian
$\omega_1, \omega_2 \in A^*$, $\omega_1 \le \omega_2$ iff $\omega_2 - \omega_1 \in A^*_+$.

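Corollary 11.19. If $A$ is a C*-algebra and $\omega_1, \omega_2 \in A^*_+$, then $\|\omega_1 + \omega_2\| =
\|\omega_1\| + \|\omega_2\|$.


Proof. For an approximate identity $\{e_\alpha\}$ of $A$, by Theorem 11.18 we have

\[ \|\omega_1 + \omega_2\| = \lim_\alpha(\omega_1 + \omega_2)(e_\alpha) = \lim_\alpha\omega_1(e_\alpha) + \lim_\alpha\omega_2(e_\alpha) = \|\omega_1\| + \|\omega_2\|. \]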


The next result is concerned with the existence of positive linear functionals. 
It will be vital to the proof of the non-commutative Gelfand—Naimark theorem. 

A positive linear functional $\omega$ on $A$ that is normalized, that is, $\|\omega\| = 1$, is
called a state of $A$. The set of all states of $A$ will be denoted by $S(A)$. It will
play in the non-commutative case a role as crucial as the role that $\Phi(A)$ played
in the commutative Gelfand–Naimark theorem. Notice that $\Phi(A) \subset S(A)$.


Theorem 11.20. Let $A$ be a C*-algebra and write $S := S(A)$. Then, for each
$x \in A$:
(i) if $A$ is unital then $\sigma(x) \subset \{\omega(x) ; \omega \in S\}$;

(ii) if $x$ is normal, then $\|x\| = \max_{\omega\in S}|\omega(x)|$; for $x$ arbitrary, $\|x\| =
\max_{\omega\in S}\|x\|_\omega$, where we recall that $\|x\|_\omega^2 = \omega(x^*x)$;

(iii) if $\omega(x) = 0$ for all $\omega \in S$, then $x = 0$;
(iv) if $\omega(x) \in \mathbb{R}$ for all $\omega \in S$, then $x$ is selfadjoint; and
(v) if $\omega(x) \in \mathbb{R}^+$ for all $\omega \in S$, then $x \in A_+$.


Also, if $B$ is a C*-subalgebra of $A$, then
(vi) every $\omega \in B^*_+$ extends to some $\tilde\omega \in A^*_+$ with the same norm.


Proof. (i) Let $\lambda \in \sigma(x)$. Then for any $\alpha, \beta \in \mathbb{C}$, $\alpha\lambda + \beta \in \sigma(\alpha x + \beta 1)$, and
therefore

\[ |\alpha\lambda + \beta| \le \|\alpha x + \beta 1\|. \tag{2} \]

Define $\omega_0 : Z := \mathrm{span}\{x, 1\} \to \mathbb{C}$ by

\[ \omega_0(\alpha x + \beta 1) := \alpha\lambda + \beta \qquad (\alpha, \beta \in \mathbb{C}). \]

If $\alpha x + \beta 1 = \alpha' x + \beta' 1$, then by (2)

\[ |(\alpha\lambda + \beta) - (\alpha'\lambda + \beta')| = |(\alpha - \alpha')\lambda + (\beta - \beta')|
   \le \|(\alpha - \alpha')x + (\beta - \beta')1\| = 0. \]

Therefore $\omega_0$ is well defined. It is clearly linear and bounded, with norm at most 1
by (2). Since $\omega_0(1) = 1$, we have $\|\omega_0\| = 1$. By the Hahn–Banach theorem, $\omega_0$ has
an extension $\omega$ as a bounded linear functional on $A$ with norm $\|\omega\| = \|\omega_0\| = 1$.
Since also $\omega(1) = \omega_0(1) = 1$, it follows from Theorem 11.18 that $\omega \in S$, and
$\lambda = \omega_0(x) = \omega(x)$.

(ii) Since all states are contractive, we have $\sup_{\omega\in S}|\omega(x)| \le \|x\|$ for any $x$.
When $x \neq 0$ is normal, we have $r(x) = \|x\|$, and therefore there exists $\lambda_1 \in \sigma(x)$
such that $|\lambda_1| = \|x\|$. By (i) applied to $A^\#$, $\lambda_1 = \omega_1^\#(x)$ for some $\omega_1^\# \in S(A^\#)$.
If $A$ is unital, we have finished: take $\omega_1 := \omega_1^\#$. If not, the restriction $\omega_1 := \omega_1^\#|_A$
is positive, $\|\omega_1\| \le \|\omega_1^\#\| = 1$, and $|\omega_1(x/\|x\|)| = |\omega_1^\#(x/\|x\|)| = |\lambda_1|/\|x\| = 1$, so
$\|\omega_1\| = 1$. This shows that the supremum is a maximum, attained at $\omega_1 \in S$,
and is equal to $\|x\|$.

For $x$ arbitrary, we apply the preceding identity to the selfadjoint (hence
normal!) element $x^*x$:

\[ \|x\|^2 = \|x^*x\| = \max_{\omega\in S}|\omega(x^*x)| = \max_{\omega\in S}\|x\|_\omega^2. \]

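(iii) Suppose that $\omega(x) = 0$ for all $\omega \in S$. Write $x = a + ib$ with $a, b \in
A$ selfadjoint. Since $\omega$ is real on selfadjoint elements (being a positive linear
functional, hence Hermitian), the relation $0 = \omega(x) = \omega(a) + i\omega(b)$ implies that
$\omega(a) = \omega(b) = 0$ for all $\omega \in S$. By (ii), it follows that $a = b = 0$, hence $x = 0$.

(iv) If $\omega(x) \in \mathbb{R}$ for all $\omega$, then (with notation as in (iii)) $\omega(b) = 0$ for all $\omega$,
and therefore $b = 0$ by (ii). Hence $x = a$ is selfadjoint.

(v) If $\omega(x) \in \mathbb{R}^+$ for all $\omega \in S$, then $x$ is selfadjoint by (iv). We also have
$\omega(x) \in \mathbb{R}^+$ for all $\omega \in A^*_+$, thus $\omega^\#(x) \in \mathbb{R}^+$ for all $\omega^\# \in (A^\#)^*_+$ as $\omega^\#|_A \in A^*_+$.
Consequently $\sigma_A(x) = \sigma_{A^\#}(x) \subset \mathbb{R}^+$ by (i); hence $x \in A_+$.

(vi) Claim: suppose that $C$ is a C*-algebra, either unital or non-unital, and
$\omega \in C^*_+$. Let $\tilde C$ be a unital C*-algebra containing $C$ and spanned by $C$ and the
unit $1$ of $\tilde C$, $1 \notin C$. Then we can extend $\omega$ to $\tilde\omega \in \tilde C^*_+$ with the same norm.

Proof of Claim: extend $\omega$ linearly to a functional $\tilde\omega$ on $\tilde C$ by letting $\tilde\omega(1) :=
\|\omega\|$. Let $\{e_\alpha\}$ be an approximate identity for $C$. For every $x \in C$ and $\zeta \in \mathbb{C}$, by
Theorem 11.18,

\[ |\tilde\omega(x + \zeta 1)| = |\omega(x) + \zeta\|\omega\|| = \lim_\alpha|\omega(xe_\alpha) + \zeta\omega(e_\alpha)| = \lim_\alpha|\omega(xe_\alpha + \zeta e_\alpha)|. \]

But for every $\alpha$ we have

\[ |\omega(xe_\alpha + \zeta e_\alpha)| \le \|\omega\|\,\|xe_\alpha + \zeta e_\alpha\| = \|\omega\|\,\|(x + \zeta 1)e_\alpha\|
   \le \|\omega\|\,\|x + \zeta 1\|\,\|e_\alpha\| \le \|\omega\|\,\|x + \zeta 1\|. \]

Therefore $|\tilde\omega(x + \zeta 1)| \le \|\omega\|\,\|x + \zeta 1\|$. In other words, $\|\tilde\omega\| \le \|\omega\|$, whence
$\|\tilde\omega\| = \|\omega\| = \tilde\omega(1)$, so $\tilde\omega$ is positive by Theorem 11.18.

By the claim, we may assume that $A$ and $B$ are unital with $1 := 1_A \in B$.
Using the Hahn–Banach theorem, extend $\omega$ to $\tilde\omega \in A^*$ with $\|\tilde\omega\| = \|\omega\|$. Thus
$\|\tilde\omega\| = \|\omega\| = \omega(1) = \tilde\omega(1)$, so $\tilde\omega \in A^*_+$ by Theorem 11.18.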


11.7 Representations and the Gelfand—Naimark— 
Segal construction 


In this section we establish the existence of “enough” representations, and prove 
the non-commutative Gelfand—Naimark theorem, saying that every C*-algebra 
has a faithful representation on some Hilbert space. In other words, every C*- 
algebra can be thought of as a C*-subalgebra of some B(X)! 

Let $A$ be a C*-algebra. A (C*-) representation of $A$ on a Hilbert space $X$ is
a *-homomorphism $\pi : A \to B(X)$. We say that $\pi$ is faithful if it is injective, or
equivalently, if it is isometric (see Theorem 11.2). If $A$ is unital and $\pi(1) = I$,
we say that $\pi$ is unital.

If there exists a vector $\zeta_0 \in X$ such that $\pi(A)\zeta_0 := \{\pi(a)\zeta_0 ; a \in A\}$ is dense
in $X$, then $\pi$ is called cyclic and $\zeta_0$ is said to be a cyclic vector for $\pi$.

If $\pi(A)X := \mathrm{span}\{\pi(a)\zeta ; a \in A, \zeta \in X\}$ is dense in $X$, then $\pi$ is called
non-degenerate.

A cyclic representation is evidently non-degenerate. Also, if $A$ is unital, then
$\pi$ is non-degenerate iff it is unital.


Example 11.21. Let $\Omega$ be a locally compact Hausdorff space and $A := C_0(\Omega)$.
Let $\mu$ be a positive Borel measure on $\Omega$ satisfying the assumptions of Lusin's
theorem (3.20). Set $X := L^2(\Omega, \mu)$. For every $f \in C_0(\Omega)$, let $M_f \in B(X)$ be
given by $g \mapsto fg$ ($g \in X$). Then $\pi : A \to B(X)$ given by $\pi(f) := M_f$ ($f \in A$)
is a representation of $A$ on $X$. It is non-degenerate because $C_c(\Omega)$ ($\subset C_0(\Omega)$) is
dense in $X$ by Corollary 3.21 and because every function in $C_c(\Omega)$ is the product
of two functions in $C_c(\Omega)$. If $\Omega$ is compact, then $\pi$ is unital. If $\mu$ is finite, then
the constant function 1 belongs to $X$ and is cyclic for $\pi$, again because $C_c(\Omega)$,
thus $C_0(\Omega)$, is dense in $X$. If the support of $\mu$ is all of $\Omega$, that is, if $\mu(V) > 0$
for each non-empty open set $V \subset \Omega$ (see Section 3.5), then $\pi$ is faithful, as if
$0 \neq f \in A$ then there is a non-empty open set $V$ such that $f$ does not vanish
anywhere on $V$ and $\mu(V) < \infty$; then the indicator $g := I_V$ of $V$ belongs to $X$
and $M_f g \neq 0$ in $X$.


Let $\omega \in S := S(A)$ be a state. Recall from Section 11.6 that $\omega$ induces on $A$
the semi-inner product (s.i.p.) given by

\[ (x, y)_\omega := \omega(y^*x) \qquad (x, y \in A). \]

The induced semi-norm

\[ \|x\|_\omega := (x, x)_\omega^{1/2} = \omega(x^*x)^{1/2} \qquad (x \in A) \]

is continuous on $A$ (by continuity of $\omega$, of the involution, and the multiplication).
Also, if $A$ is unital, then it is "normalized", that is, $\|1\|_\omega = 1$ (because $\omega(1) = 1$
by Theorem 11.18).
Let

\[ J_\omega := \{x \in A ; \|x\|_\omega = 0\} = \|\cdot\|_\omega^{-1}(\{0\}). \]

The properties of the semi-norm imply that $J_\omega$ is a closed subspace of $A$.

By the Cauchy–Schwarz inequality (see (1) of Section 11.6), if $x \in J_\omega$, then
$(x, y)_\omega = (y, x)_\omega = 0$ for all $y \in A$. This implies that $J_\omega$ is left $A$-invariant (i.e., a
left ideal), because for all $x \in J_\omega$ and $y \in A$

\[ \|yx\|_\omega^2 = \omega((yx)^*(yx)) = \omega((y^*yx)^*x) = (x, y^*yx)_\omega = 0. \]


Lemma 11.22. Let $\omega \in S(A)$. Then for all $x, y \in A$,

\[ \|xy\|_\omega \le \|x\|\,\|y\|_\omega. \]


Proof. Since $x^*x \le \|x^*x\|1$ (in $A^\#$), we have $y^*x^*xy \le \|x^*x\|\,y^*y$ by
Theorem 11.6, so by monotonicity,

\[ \|xy\|_\omega^2 = \omega((xy)^*(xy)) = \omega(y^*x^*xy) \le \|x^*x\|\,\omega(y^*y) = \|x\|^2\|y\|_\omega^2. \]


The s.i.p. $(\cdot,\cdot)_\omega$ induces an inner product (same notation) on the quotient
space $A/J_\omega$:
$$(x + J_\omega, y + J_\omega)_\omega := (x,y)_\omega \quad (x,y \in A).$$
This definition is independent of the coset representatives, because if $x + J_\omega =
x' + J_\omega$ and $y + J_\omega = y' + J_\omega$, then $x - x', y - y' \in J_\omega$, and therefore (dropping
the subscript $\omega$)

$$(x',y') = (x' - x, y') + (x,y) + (x, y' - y) = (x,y)$$

(the first and third summands vanish, because one of the factors of the s.i.p. is
in J).
If $\|x + J_\omega\|_\omega^2 := (x + J_\omega, x + J_\omega)_\omega = 0$, then $\|x\|_\omega^2 = (x,x)_\omega = 0$, that is,
$x \in J_\omega$, hence $x + J_\omega$ is the zero coset $J_\omega$. This means that $\|\cdot\|_\omega$ is a norm on
$A/J_\omega$. Let $X_\omega$ be the completion of $A/J_\omega$ with respect to this norm. Then $X_\omega$
is a Hilbert space (its inner product is the unique continuous extension of $(\cdot,\cdot)_\omega$
from the dense subspace $A/J_\omega \times A/J_\omega$ to $X_\omega \times X_\omega$; the extension is also denoted
by $(\cdot,\cdot)_\omega$).

For each $x \in A$, consider the map
$$\pi(x) := \pi_\omega(x) : A/J_\omega \to A/J_\omega$$
defined by
$$\pi(x)(y + J_\omega) = xy + J_\omega \quad (y \in A).$$

It is well defined, because if $y, y'$ represent the same coset, then $\|y - y'\|_\omega = 0$,
and therefore, by Lemma 11.22, $\|x(y - y')\|_\omega = 0$, which means that $xy$ and $xy'$
represent the same coset.

The map $\pi(x)$ is clearly linear. It is also bounded, with operator norm
(on the normed space $A/J_\omega$) $\|\pi(x)\| \le \|x\|$: indeed, by Lemma 11.22, for all
$y \in A$,

$$\|\pi(x)(y + J_\omega)\|_\omega = \|xy + J_\omega\|_\omega = \|xy\|_\omega \le \|x\|\,\|y\|_\omega = \|x\|\,\|y + J_\omega\|_\omega.$$

Therefore, $\pi(x)$ extends uniquely by continuity to a bounded operator on $X_\omega$
(also denoted $\pi(x)$), with operator norm $\|\pi(x)\| \le \|x\|$.

A routine calculation shows that $x \mapsto \pi(x)$ is an algebra homomorphism of A
into $B(A/J_\omega)$, and a continuity argument implies that it is a homomorphism of
A into $B(X_\omega)$. If A is unital, then since $\pi(1)$ is the identity operator on $A/J_\omega$,
we have $\pi(1) = I$, the identity operator on $X_\omega$.

For all $x,y,z \in A$, we have (dropping the index $\omega$)

$$(\pi(x)(y + J), z + J) = (xy + J, z + J) = (xy, z) = \omega(z^*xy) = \omega((x^*z)^*y)$$
$$= (y, x^*z) = (y + J, x^*z + J) = (y + J, \pi(x^*)(z + J)).$$
By continuity, we obtain the identity

$$(\pi(x)\xi, \eta) = (\xi, \pi(x^*)\eta) \quad (\xi,\eta \in X_\omega),$$

that is
$$(\pi(x))^* = \pi(x^*) \quad (x \in A).$$

We conclude that $\pi : x \mapsto \pi(x)$ is a (norm-decreasing) *-homomorphism of
A into $B(X_\omega)$, namely, it is a representation of A on the Hilbert space $X_\omega$,
which is unital if A is. The construction of the “canonical” representation $\pi$
is referred to as the Gelfand-Naimark-Segal (GNS) construction; accordingly,
$\pi = \pi_\omega : x \mapsto \pi_\omega(x)$ will be called the GNS representation associated with the
given state $\omega$ on A.

Assume for the moment that A is unital and consider the unit vector $\zeta_\omega :=
1 + J_\omega \in X_\omega$ (it is a unit vector because $\|\zeta_\omega\|_\omega = \|1\|_\omega = \omega(1^*1)^{1/2} = 1$). By
definition of $X_\omega$, the set

$$\{\pi(x)\zeta_\omega;\ x \in A\} = \{x + J_\omega;\ x \in A\} = A/J_\omega$$

is dense in $X_\omega$. Consequently, the representation $\pi : x \mapsto \pi(x)$ is cyclic, with
cyclic vector $\zeta_\omega$. Note also the identity (dropping the index $\omega$)

$$\omega(x) = \omega(1^*x) = (x,1) = (x + J, 1 + J) = (\pi(x)(1 + J), 1 + J) = (\pi(x)\zeta, \zeta). \tag{1}$$

Thus, the state $\omega$ is realized through the representation $\pi$ as the composition
$\omega_\zeta \circ \pi$, where $\omega_\zeta$ is the so-called vector state on $B(X_\omega)$ corresponding to the unit
vector $\zeta$, defined by

$$\omega_\zeta(T) = (T\zeta, \zeta) \quad (T \in B(X_\omega))$$

(see the examples in Section 11.6). One can show that a cyclic unit vector $\zeta =
\zeta_\omega \in X_\omega$ for $\pi$ with the same property $\omega = \omega_\zeta \circ \pi$ exists also when A is not
unital; see Exercise 17.
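
As a sanity check on the construction, the following is a minimal worked example. The choice of algebra, $A = M_2(\mathbb{C})$, and of state, $\omega(x) = x_{11}$, is an assumption made here for illustration and does not come from the text; $e_1, e_2$ denote the standard basis of $\mathbb{C}^2$.

```latex
% Assumed example (not from the text): A = M_2(C), \omega(x) = x_{11} = (x e_1, e_1).
\begin{align*}
\|x\|_\omega^2 &= \omega(x^*x) = (x^*x\,e_1, e_1) = \|x e_1\|^2, \\
J_\omega &= \{x \in A;\ x e_1 = 0\}
          = \Bigl\{\begin{pmatrix} 0 & b \\ 0 & d \end{pmatrix};\ b,d \in \mathbb{C}\Bigr\}, \\
(x,y)_\omega &= \omega(y^*x) = (x e_1, y e_1), \\
x + J_\omega &\mapsto x e_1 \quad \text{is an isometric isomorphism of } A/J_\omega \text{ onto } \mathbb{C}^2, \\
\pi_\omega(x)(y + J_\omega) &= xy + J_\omega \ \longleftrightarrow\ x\,(y e_1), \qquad
\zeta_\omega = 1 + J_\omega \ \longleftrightarrow\ e_1 .
\end{align*}
```

Under this identification $\pi_\omega$ is, up to unitary equivalence, the identity representation of $M_2(\mathbb{C})$ on $\mathbb{C}^2$, and identity (1) reduces to $\omega(x) = (x e_1, e_1)$. No completion step is needed here because $A/J_\omega$ is finite-dimensional.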




The GNS representation is “universal” in a certain sense which we proceed 
to specify. 

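Suppose $\pi' : x \mapsto \pi'(x)$ is any cyclic representation of A on a Hilbert space
Z, with unit cyclic vector $\eta \in Z$ such that $\omega = \omega_\eta \circ \pi'$. Then by (1), for all
$x \in A$,

$$\|\pi'(x)\eta\|_Z^2 = (\pi'(x)\eta, \pi'(x)\eta)_Z = (\pi'(x^*x)\eta, \eta)_Z$$
$$= (\omega_\eta \circ \pi')(x^*x) = \omega(x^*x) = (\pi(x^*x)\zeta, \zeta)_\omega = \|\pi(x)\zeta\|_\omega^2. \tag{2}$$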


Since $\pi$ and $\pi'$ are cyclic representations on $X_\omega$ and Z with respective cyclic
vectors $\zeta$ and $\eta$, the subspaces $\tilde{X}_\omega := \{\pi(x)\zeta;\ x \in A\}$ and $\tilde{Z} := \{\pi'(x)\eta;\ x \in A\}$
are dense in $X_\omega$ and Z, respectively. Define $U : \tilde{X}_\omega \to \tilde{Z}$ by

$$U(\pi(x)\zeta) = \pi'(x)\eta \quad (x \in A).$$

It follows from (2) that U is a (well defined) linear isometry of $\tilde{X}_\omega$ onto $\tilde{Z}$. It extends uniquely
by continuity as a linear isometry of $X_\omega$ onto Z. Thus, U is a Hilbert space
isomorphism of $X_\omega$ onto Z. For all $x,y \in A$ we have

$$(U\pi(x))(\pi(y)\zeta) = U\pi(xy)\zeta = \pi'(xy)\eta = \pi'(x)\pi'(y)\eta$$
$$= \pi'(x)(U\pi(y)\zeta) = (\pi'(x)U)(\pi(y)\zeta).$$

Thus, $U\pi(x) = \pi'(x)U$ on the dense subspace $\tilde{X}_\omega$ of $X_\omega$; by continuity of the
operators, it follows that $U\pi(x) = \pi'(x)U$, that is, $\pi'(x) = U\pi(x)U^*$ for all $x \in A$
(one says that the representations $\pi'$ and $\pi$ are unitarily equivalent, through the
unitary equivalence $U : X_\omega \to Z$). When A is unital, U carries $\zeta$ onto $\eta$ because
$U\zeta = UI\zeta = U\pi(1)\zeta = \pi'(1)\eta = I\eta = \eta$, where we use the notation I for the
identity operator in both Hilbert spaces. This is also true when A is not unital;
see again Exercise 17. This concludes the proof of the following


Theorem 11.23 (The Gelfand-Naimark-Segal theorem). Let $\omega$ be a state
of a C*-algebra A. Then the associated GNS representation $\pi := \pi_\omega$ is a cyclic
representation of A on the Hilbert space $X := X_\omega$ with a unit cyclic vector
$\zeta := \zeta_\omega$ such that $\omega = \omega_\zeta \circ \pi$. If $\pi'$ is another cyclic representation of A on a
Hilbert space Z with unit cyclic vector $\eta$ such that $\omega = \omega_\eta \circ \pi'$, then $\pi'$ is unitarily
equivalent to $\pi$ under a unitary equivalence $U : X \to Z$ such that $U\zeta = \eta$.


Example 11.24 (The GNS construction of integration states of $C_0(\Omega)$).
Let $\Omega$ be a locally compact Hausdorff space and $\mu$ be a regular probability
Borel measure on $\Omega$. The representation of $C_0(\Omega)$ on $L^2(\Omega,\mu)$ constructed in
Example 11.21 ($\pi(f) = M_f$) is (unitarily equivalent to) the GNS representation
of the state $\omega : C_0(\Omega) \to \mathbb{C}$, $f \mapsto \int_\Omega f\,d\mu$. Indeed, the constant function 1 is
cyclic for $\pi$, and for all $f \in C_0(\Omega)$, $(\pi(f)1, 1) = (f,1) = \int_\Omega f\,\bar{1}\,d\mu = \omega(f)$.


Example 11.25 (The GNS construction of vector states). Let X be a
Hilbert space, A be a C*-subalgebra of B(X), $\zeta \in X$ be a unit vector, and $\omega :=
\omega_\zeta|_A$. Suppose that A is non-degenerate on X, i.e., $\overline{\mathrm{span}}\{x\eta;\ x \in A, \eta \in X\} = X$.
Then the subspace $Y := \overline{A\zeta}$ is A-invariant (in fact, it is reducing for every
element of A), and we let the reader verify that $\zeta \in Y$. Obviously, the
representation $\pi : x \mapsto x|_Y$ of A on the Hilbert space Y has $\zeta$ as a cyclic
vector and $\omega = \omega_\zeta \circ \pi$. Thus, the GNS representation of A associated to $\omega$ is
unitarily equivalent to $\pi$.


Let X be the Hilbert space direct sum

$$X := \sum_{\omega \in S} \oplus X_\omega$$

(recall the construction of Section 8.6). For each $x \in A$, define $\pi(x) \in B(X)$ by
$$\pi(x) := \sum_{\omega \in S} \oplus \pi_\omega(x).$$

This makes sense because $\|\pi_\omega(x)\| \le \|x\|$ for each $\omega \in S$, from which we also get
$\|\pi(x)\| \le \|x\|$. Actually, we have $\|\pi(x)\| = \|x\|$. Indeed, for each $\omega \in S$, consider
the vector $f_\omega \in X$ defined by
$$f_\omega(\omega) := \zeta_\omega, \qquad f_\omega(\rho) := 0 \quad (\rho \in S,\ \rho \neq \omega). \tag{3}$$

Then $\|f_\omega\| = \|\zeta_\omega\|_\omega = 1$ and

$$\|\pi(x) f_\omega\| = \|\pi_\omega(x)\zeta_\omega\|_\omega = \|x\|_\omega.$$
Hence, $\|\pi(x)\| \ge \|x\|_\omega$ for all $\omega \in S$, and therefore

$$\|\pi(x)\| \ge \max_{\omega \in S} \|x\|_\omega = \|x\|$$

by Theorem 11.20 (ii). Together with the preceding inequality, we obtain
$\|\pi(x)\| = \|x\|$ for all $x \in A$.

An easy calculation shows that the map $\pi : x \mapsto \pi(x)$ of A into B(X) is an
algebra homomorphism. Also $\pi(x^*) = (\pi(x))^*$ because for all $f,g \in X$,

$$(\pi(x^*)f, g) = \sum_{\omega \in S} (\pi_\omega(x^*)f(\omega), g(\omega))_\omega = \sum_{\omega \in S} ((\pi_\omega(x))^* f(\omega), g(\omega))_\omega$$
$$= \sum_{\omega \in S} (f(\omega), \pi_\omega(x)g(\omega))_\omega = (f, \pi(x)g).$$

Thus, $\pi$ is an isometric *-isomorphism of the C*-algebra A onto the C*-subalgebra
$\pi(A)$ of B(X), namely a faithful representation of A on X. It is
non-degenerate because each $\pi_\omega$ is non-degenerate. If A is unital, so is $\pi$, that
is, $\pi(1) = I$. The usual notation for $\pi$ is $\sum_{\omega \in S} \oplus \pi_\omega$; it is called the direct sum
of the representations $\{\pi_\omega\}_{\omega \in S}$. This particular representation of A is usually
referred to as the universal representation of A. We thus proved the following
theorem, which can be considered the starting point of C*-algebra theory.




Theorem 11.26 (The non-commutative Gelfand—Naimark theorem). 
Every C*-algebra A admits a faithful non-degenerate representation, that is, A 
is isometrically *-isomorphic to a non-degenerate C*-subalgebra of B(X) for 
some Hilbert space X. 

A special such representation (called the “universal representation”) of the
C*-algebra A is the direct sum representation $\pi := \sum_{\omega \in S} \oplus \pi_\omega$ on $X =
\sum_{\omega \in S} \oplus X_\omega$ of the GNS representations $\{\pi_\omega\}_{\omega \in S}$, where $S := S(A)$.


The universality property of the universal representation $\pi$ is that every state
of $\pi(A)$ ($\subset B(X)$) is a vector state. Indeed, every state $\tau$ of $\pi(A)$ is of the form
$\omega \circ \pi^{-1}$ for some state $\omega \in S$. Consider the vector $f := f_\omega \in X$ of (3). Since
$\omega = \omega_\zeta \circ \pi_\omega$ for $\zeta := \zeta_\omega$, it is clear that $\tau = \omega_f|_{\pi(A)}$. See also Exercise 24.


11.7.1 Irreducible representations 


In this short section we introduce and characterize a special type of 
representations of C*-algebras, namely irreducible ones. An elementary way to 
construct such representations appears in Section 11.8. 

We start with several notions and a (very useful) lemma that connects them.
For a Hilbert space X and a subset $R \subset B(X)$, the commutant $R'$ of R (in
B(X)) is the set

$$R' := \{S \in B(X);\ ST = TS \text{ for all } T \in R\}.$$
Say that a closed subspace of X is R-invariant if it is T-invariant for each $T \in R$.

Lemma 11.27. Let X be a Hilbert space and R be a *-subalgebra of B(X).
Consider the following conditions for a projection $P \in B(X)$:

(i) $P \in R'$;
(ii) the range of P is R-invariant;
(iii) the range of P has the form $\overline{RS}$, where $S \subset X$ and $RS := \mathrm{span}\{Tx;\ T \in
R, x \in S\}$.

Then (i) $\iff$ (ii) $\Longleftarrow$ (iii), and all conditions are equivalent when R is non-degenerate,
that is, $\overline{RX} = X$.


Proof. (iii) $\Rightarrow$ (ii) Immediate, because R is stable under multiplication.
(ii) $\Rightarrow$ (i) For each $A \in R$, the range of P is invariant for both A and $A^*$
(because $A^* \in R$, by selfadjointness of R), so by Terminology 8.5, Point (c), we
have $PA = AP$. That is, $P \in R'$.
(i) $\Rightarrow$ (ii) For each $x \in X$ and $T \in R$, $TPx = PTx$ is in the range of P.
Finally, if R is non-degenerate then (i) $\Rightarrow$ (iii), because the range PX of a
projection $P \in R'$ equals $\overline{R(PX)}$. Indeed, $R(PX) = P(RX)$ since $P \in R'$, and
we have $PX = P(\overline{RX}) \subset \overline{P(RX)} \subset \overline{PX} = PX$.


Let $\pi$ be a representation of a C*-algebra A on a Hilbert space X. If Y
is a $\pi(A)$-invariant closed subspace of X, the function from A to B(Y) given
by $a \mapsto \pi(a)|_Y$, $a \in A$, is (well defined and) a representation of A on Y. Such a
representation of A is called a sub-representation of $\pi$.

We say that $\pi$ is irreducible if it has only trivial sub-representations, that is, if
the only $\pi(A)$-invariant closed subspaces of X are {0} and X. By Lemma 11.27,
this means that the only projections in the commutant $\pi(A)'$ are 0 and the
identity operator I. Since the projections in $\pi(A)'$ span a norm-dense subset of
$\pi(A)'$ (see Section 12.1), this just means that the commutant is trivial: $\pi(A)' =
\mathbb{C}I$.

Irreducibility of $\pi$ has various implications. For instance, an irreducible $\pi$ is evidently
non-degenerate. But more is true: if $\zeta \in X$ is non-zero, then the closed subspace
$\overline{\pi(A)\zeta}$ is $\pi(A)$-invariant (and is also non-zero!) and thus equals X. In other
words, every non-zero vector in X is cyclic for $\pi$. The converse is also true: if
every non-zero vector in X is cyclic for $\pi$ then $\pi$ is irreducible, because every
non-zero closed subspace of X invariant under $\pi(A)$ contains $\overline{\pi(A)\zeta} = X$ for some
non-zero $\zeta \in X$.
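
To make the commutant criterion concrete, here is a short worked comparison of a reducible and an irreducible case. Both algebras, and the symbol $D_2$ for the diagonal matrices, are assumptions chosen here for illustration; each acts on $\mathbb{C}^2$ by matrix multiplication.

```latex
% Assumed examples (not from the text), both acting on X = C^2 by matrix multiplication.
\begin{align*}
\text{(a)}\quad & R = D_2 := \{\operatorname{diag}(\lambda_1,\lambda_2);\ \lambda_1,\lambda_2 \in \mathbb{C}\}: \\
  & \qquad R' = D_2 \ni P = \operatorname{diag}(1,0), \qquad PX = \mathbb{C}e_1 \text{ is a non-trivial } R\text{-invariant subspace}; \\
\text{(b)}\quad & R = M_2(\mathbb{C}): \\
  & \qquad R' = \mathbb{C}I, \qquad R\zeta = \mathbb{C}^2 \text{ for every } \zeta \neq 0 .
\end{align*}
```

In case (a) the commutant contains a projection other than 0 and I, so the identity representation of $D_2$ is reducible; in case (b) the commutant is trivial and every non-zero vector is cyclic, in line with the two characterizations of irreducibility given above.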


11.8 Positive linear functionals and convexity 


This section discusses notions and results on positive linear functionals that are 
related to convexity. Again, A is a C*-algebra and S(A) is the set of all states 
of A. 

In this section the proofs in the non-unital case sometimes require a small
technical change. The reader can safely assume that A is unital on first reading.

The set $S(A) = \{\omega \in A^*_+;\ \|\omega\| = 1\}$ is convex in $A^*$ by Corollary 11.19. The
set $K(A) := \{\omega \in A^*_+;\ \|\omega\| \le 1\}$ is convex as the intersection of two convex sets:
$A^*_+$ and the norm-closed unit ball of $A^*$. The first of these sets is weak*-closed
and the second is weak*-compact by Alaoglu’s theorem, so K(A) is also weak*-compact.
The set S(A) is weak*-compact when A is unital by Theorem 11.18.


11.8.1 Pure states 


A state $\omega$ of A is called pure if every $\rho \in A^*_+$ dominated by $\omega$, that is, $\rho \le \omega$, is a
multiple of $\omega$, that is, $\rho = c\omega$ for some $c \in [0,1]$. Notice that $\rho \in A^*_+$ is dominated
by $\omega$ iff there exists $\rho' \in A^*_+$ such that $\rho + \rho' = \omega$; and in this situation we have
$\|\rho\| + \|\rho'\| = \|\omega\| = 1$ by Corollary 11.19. Consequently, the pure states of A are
precisely the extremal points of the convex set S(A).

Denote the set of all pure states of A by P(A). 
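
The definition can be tested on the smallest non-trivial commutative case. The following sketch assumes the algebra $A = \mathbb{C}^2$ with pointwise operations and the family of states $\omega_t$; neither is taken from the text.

```latex
% Assumed example (not from the text): A = C^2 with pointwise operations.
\begin{align*}
S(A) &= \{\omega_t;\ t \in [0,1]\}, \qquad \omega_t(a_1,a_2) := t\,a_1 + (1-t)\,a_2 . \\
\intertext{For $0 < t < 1$, the positive functional $\rho(a_1,a_2) := t\,a_1$ satisfies}
\omega_t - \rho &= (1-t)\,a_2 \ \ge 0 \text{ on } A_+, \quad\text{so } \rho \le \omega_t, \\
\rho = c\,\omega_t &\ \Longrightarrow\ c\,t = t \text{ and } c\,(1-t) = 0 \ \Longrightarrow\ c = 1 \text{ and } c = 0,
\end{align*}
```

a contradiction, so $\omega_t$ is not pure for $0 < t < 1$. The two remaining states $\omega_1$ and $\omega_0$ (evaluation at the first and second coordinate) are the endpoints of the segment $S(A)$, hence its extremal points, and these are the pure states.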

If A is unital, then since S(A) is weak*-compact and convex in $A^*$, the
Krein-Milman theorem implies that $S(A) = \overline{\mathrm{co}}^{\,\mathrm{weak}^*}(P(A))$.

For general A, we assert that $P(A) \cup \{0\}$ is the set of extremal points of the
(weak*-compact and convex) set K(A), so the Krein-Milman theorem implies
that $K(A) = \overline{\mathrm{co}}^{\,\mathrm{weak}^*}(P(A) \cup \{0\})$. Indeed, if $\omega \in A^*_+$ and $0 < \|\omega\| < 1$, then
$\omega$ is not an extremal point of K(A) because

$$\omega = \frac{1}{2}\Bigl(\frac{\|\omega\| + s}{\|\omega\|}\,\omega + \frac{\|\omega\| - s}{\|\omega\|}\,\omega\Bigr)$$

with $s := \min(\|\omega\|, 1 - \|\omega\|)$. Conversely, the subsets S(A) and {0} of K(A) are
extremal in K(A), thus the elements of $P(A) \cup \{0\}$ are extremal points of K(A).
Remark that P(A) is generally not weak*-closed in $A^*$ even if A is unital.
We now establish that there are “enough” pure states with the following
analogue of Theorem 11.20.

Theorem 11.28. Let A be a C*-algebra and denote $P := P(A)$. Then, for each
$x \in A$:

(i) if x is normal, then $\|x\| = \max_{\omega \in P} |\omega(x)|$; for x arbitrary, $\|x\| =
\max_{\omega \in P} \|x\|_\omega$;
(ii) if $\omega(x) = 0$ for all $\omega \in P$, then $x = 0$;
(iii) if $\omega(x) \in \mathbb{R}$ for all $\omega \in P$, then x is selfadjoint;
(iv) if $\omega(x) \in \mathbb{R}^+$ for all $\omega \in P$, then $x \in A_+$.


Proof. We explained that $K(A) = \overline{\mathrm{co}}^{\,\mathrm{weak}^*}(P(A) \cup \{0\})$. As $S(A) \subset K(A)$,
every state of A is the weak*-limit of a net of convex combinations of elements
of $P(A) \cup \{0\}$. Hence, Parts (ii)-(iv) follow from the analogous ones in
Theorem 11.20.

(i) Suppose that x is normal and non-zero. By Theorem 11.20 there exists
$\omega \in S(A)$ such that $|\omega(x)| = \|x\|$. Denote $\lambda := \omega(x)$. The set $C :=
\{\rho \in K(A);\ \rho(x) = \lambda\}$ is non-empty because it contains $\omega$. It is evidently weak*-closed,
thus weak*-compact, in $A^*$. It is also extremal in K(A), because if
$\omega_1, \omega_2 \in K(A)$ and $t \in (0,1)$ are such that $t\omega_1(x) + (1-t)\omega_2(x) = \lambda$, then as
$|\omega_i(x)| \le \|x\| = |\lambda|$ ($i = 1,2$) we must have $\omega_i(x) = \lambda$ ($i = 1,2$). By the Krein-Milman
theorem, C has an extremal point $\omega'$. By the foregoing, $\omega'$ is also an
extremal point of K(A), namely $\omega' \in P(A) \cup \{0\}$. But $\omega' \neq 0$ because $x \neq 0$ and
thus $\lambda \neq 0$. Therefore $\omega'$ is a pure state. In conclusion, $\|x\| = \max_{\omega \in P} |\omega(x)|$.
The second assertion in (i) follows as in the proof of Theorem 11.20.


Next, we characterize those GNS representations associated with pure states. 


Theorem 11.29. The GNS representation $\pi_\omega$ associated to a state $\omega$ of a C*-algebra
A is irreducible iff $\omega$ is pure.


We require the following simple lemma, slightly extending Exercise 15 of 
Chapter 6. See also Lemma 10.12. 


Lemma 11.30. Let X be a Hilbert space and $[\cdot,\cdot] : X \times X \to \mathbb{C}$ a function that
is linear in the left variable and conjugate linear in the right variable. Assume
that $[\cdot,\cdot]$ is bounded, that is, there exists $M \in [0,\infty)$ such that

$$|[x,y]| \le M\,\|x\|\,\|y\| \quad (\forall x,y \in X) \tag{1}$$
(M is then called a bound for $[\cdot,\cdot]$). Then there is a unique $T \in B(X)$ such that

$$[x,y] = (Tx, y) \quad (\forall x,y \in X). \tag{2}$$

Moreover, $\|T\| \le M$. Furthermore, T is selfadjoint iff $[\cdot,\cdot]$ is Hermitian, and T
is positive iff $[\cdot,\cdot]$ is positive semi-definite (i.e., a semi-inner product on X).


Proof. For a given $y \in X$, the linear functional $[\cdot, y]$ on X is bounded by $M\|y\|$
according to (1). Therefore, by the “Little” Riesz representation theorem, there
is a unique $Sy \in X$ such that

$$[x,y] = (x, Sy) \tag{3}$$

for all $x \in X$. This construction yields a unique function $S : X \to X$ that satisfies
(3) for all x, y, which is linear since $[\cdot,\cdot]$ is conjugate linear in the right variable,
and is bounded by M because $\|Sy\|_X = \|[\cdot,y]\|_{X^*} \le M\|y\|$ for all $y \in X$. Thus,
$T := S^*$ satisfies (2) (which uniquely determines it) and also has norm at most
M. The other assertions are immediate.
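
In finite dimensions the lemma is just the correspondence between sesquilinear forms and matrices. The sketch below assumes $X = \mathbb{C}^n$ with its standard inner product and an arbitrary matrix $(m_{ij})$, and writes $[x,y]$ for the bounded form of the lemma; none of these choices comes from the text.

```latex
% Assumed example (not from the text): X = C^n, (x,y) = \sum_i x_i \overline{y_i}.
\begin{align*}
[x,y] &:= \sum_{i,j=1}^{n} m_{ij}\, x_j\, \overline{y_i}
       = \sum_{i=1}^{n} (Tx)_i\, \overline{y_i} = (Tx, y), \qquad T := (m_{ij})_{1 \le i,j \le n}, \\
[x,y] &= (x, T^*y), \qquad\text{so } S = T^* \text{ in the notation of the proof}, \\
|[x,y]| &\le \|T\|\,\|x\|\,\|y\|, \qquad\text{so } M = \|T\| \text{ is a bound for } [\cdot,\cdot].
\end{align*}
```

The form is Hermitian exactly when $m_{ij} = \overline{m_{ji}}$ for all $i,j$, that is, when T is selfadjoint, matching the last assertion of the lemma.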


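Proposition 11.31. Let X be a Hilbert space and $\mathcal{A}$ be a C*-subalgebra of B(X).
Let $\zeta \in X$ and let $\rho \in \mathcal{A}^*_+$ be dominated by the vector functional $\omega_\zeta|_{\mathcal{A}}$, that is,
$\rho \le \omega_\zeta|_{\mathcal{A}}$. Then there exists $T \in \mathcal{A}' := \{S \in B(X);\ SA = AS \text{ for all } A \in \mathcal{A}\}$,
$0 \le T \le I$, such that $\rho = \omega_{T^{1/2}\zeta}|_{\mathcal{A}}$. That is,

$$\rho(A) = (AT^{1/2}\zeta, T^{1/2}\zeta) = (AT\zeta, \zeta) \quad (\forall A \in \mathcal{A}). \tag{4}$$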


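Proof. Assume for convenience that $\zeta$ is cyclic for $\mathcal{A}$, namely that $\overline{\mathcal{A}\zeta} = X$.
Consider the normed subspace $\mathcal{A}\zeta$ of X. The function $[\cdot,\cdot] : (\mathcal{A}\zeta) \times (\mathcal{A}\zeta) \to \mathbb{C}$
given by
$$[A_1\zeta, A_2\zeta] := \rho(A_2^*A_1) \quad (A_1, A_2 \in \mathcal{A})$$

is a (well defined) semi-inner product on $\mathcal{A}\zeta$ bounded by 1. Indeed, on account of
the positivity of $\rho$ we have $[A\zeta, A\zeta] = \rho(A^*A) \ge 0$ for all $A \in \mathcal{A}$ as $A^*A \in \mathcal{A}_+$,
and for all $A_1, A_2 \in \mathcal{A}$, by the Cauchy-Schwarz inequality ((1) in Section 11.6)
and by the assumption that $\rho \le \omega_\zeta|_{\mathcal{A}}$,

$$|\rho(A_2^*A_1)| \le \rho(A_2^*A_2)^{1/2}\rho(A_1^*A_1)^{1/2} \le \omega_\zeta(A_2^*A_2)^{1/2}\omega_\zeta(A_1^*A_1)^{1/2} = \|A_2\zeta\|\,\|A_1\zeta\|.$$

Consequently, $[\cdot,\cdot]$ extends to a semi-inner product on the Hilbert space X
bounded by 1. Therefore, by Lemma 11.30, there is an operator $T \in B(X)$,
$0 \le T \le I$, such that $(T\xi, \eta) = [\xi, \eta]$ for all $\xi, \eta \in X$. In particular, for every
$A_1, A_2 \in \mathcal{A}$,

$$(TA_1\zeta, A_2\zeta) = \rho(A_2^*A_1). \tag{5}$$

For all $A, A_1, A_2 \in \mathcal{A}$ we thus have
$$(TAA_1\zeta, A_2\zeta) = \rho(A_2^*AA_1) = \rho((A^*A_2)^*A_1) = (TA_1\zeta, A^*A_2\zeta) = (ATA_1\zeta, A_2\zeta).$$

The fact that $\overline{\mathcal{A}\zeta} = X$ and the boundedness of T and A imply that $TA = AT$,
proving that $T \in \mathcal{A}'$. In conclusion, (5) now says that for every $A_1, A_2 \in \mathcal{A}$,

$$(A_2^*A_1T\zeta, \zeta) = (TA_1\zeta, A_2\zeta) = \rho(A_2^*A_1). \tag{6}$$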


Every element of $\mathcal{A}$ is a linear combination of positive elements, so
$\mathrm{span}\{A_2^*A_1;\ A_1, A_2 \in \mathcal{A}\} = \mathcal{A}$ (in fact, the “span” is not necessary by Exercise 6).
Thus (6) proves (4). Alternatively, (6) implies (4) by Exercise 16.


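Proof of Theorem 11.29. Recall that $\pi_\omega$ is a representation of A on a Hilbert
space $X_\omega$ with a unit cyclic vector $\zeta := \zeta_\omega$ for $\pi_\omega$ such that $\omega = \omega_\zeta \circ \pi_\omega$.

($\Longrightarrow$) Suppose that $\pi_\omega$ is irreducible and let $\rho \in A^*_+$ be dominated by $\omega$.
Recall that $\pi_\omega(A)$ is a C*-algebra (Theorem 11.14) and define on it a positive
linear functional $\rho'$ by $\rho'(\pi_\omega(a)) := \rho(a)$ for $a \in A$. It is well defined because by
the Cauchy-Schwarz inequality and the assumption $\rho \le \omega$, for all $a \in A$,

$$|\rho(a)| = |\rho(1^*a)| \le \rho(1^*1)^{1/2}\rho(a^*a)^{1/2} \le \rho(a^*a)^{1/2} \le \omega(a^*a)^{1/2} = \|\pi_\omega(a)\zeta\|$$

(when A is unital; otherwise, use approximate identities). Now, the assumption
$\rho \le \omega$ implies that $\rho' \le \omega_\zeta|_{\pi_\omega(A)}$. So by Proposition 11.31, there exists $T \in
\pi_\omega(A)'$, $0 \le T \le I$, such that $\rho' = \omega_{T^{1/2}\zeta}|_{\pi_\omega(A)}$. But $\pi_\omega$ is irreducible, that is,
$\pi_\omega(A)' = \mathbb{C}I$. If now $c \in [0,1]$ is such that $T = cI$, then for all $a \in A$,

$$\rho(a) = \rho'(\pi_\omega(a)) = c\,(\pi_\omega(a)\zeta, \zeta) = c\,\omega(a).$$

In other words, $\rho = c\omega$. Thus, $\omega$ is pure.
($\Longleftarrow$) Suppose that $\omega$ is pure. Let $P' \in \pi_\omega(A)'$ be a projection. Consider
the positive functional $\rho := \omega_{P'\zeta} \circ \pi_\omega$ on A. For every $a \in A$,

$$\rho(a^*a) = (\pi_\omega(a^*a)P'\zeta, P'\zeta) = (\pi_\omega(a)P'\zeta, \pi_\omega(a)P'\zeta) = (P'\pi_\omega(a)\zeta, P'\pi_\omega(a)\zeta)$$
$$\le (\pi_\omega(a)\zeta, \pi_\omega(a)\zeta) = (\pi_\omega(a^*a)\zeta, \zeta) = \omega(a^*a).$$

Hence $\rho \le \omega$. By assumption, there is $c \in [0,1]$ such that $\rho = c\omega$. So for every
$a, b \in A$,

$$(P'\pi_\omega(a)\zeta, \pi_\omega(b)\zeta) = \rho(b^*a) = c\,\omega(b^*a) = (c\,\pi_\omega(a)\zeta, \pi_\omega(b)\zeta).$$

From $\zeta$ being cyclic for $\pi_\omega$ we infer that $P' = cI$. Since $P'$ is a projection, $P' \in \{0, I\}$, proving
that $\pi_\omega$ is irreducible.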
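
For contrast with a pure state, here is a worked case in which the state is not pure and the GNS representation is visibly reducible. The algebra $A = M_2(\mathbb{C})$, the normalized trace $\tau$, and the subspace $Y$ are assumptions chosen here for illustration; $e_1, e_2$ denote the standard basis of $\mathbb{C}^2$.

```latex
% Assumed example (not from the text): A = M_2(C), \tau(x) = (1/2) tr(x).
\begin{align*}
\tau(x^*x) &= \tfrac12 \sum_{i,j=1}^{2} |x_{ij}|^2, \qquad\text{so } J_\tau = \{0\} \text{ and } X_\tau = M_2(\mathbb{C}), \\
(x,y)_\tau &= \tfrac12 \operatorname{tr}(y^*x), \qquad \pi_\tau(x)\,y = xy, \qquad \zeta_\tau = 1, \\
Y &:= \{y \in M_2(\mathbb{C});\ y e_2 = 0\} \ \text{satisfies}\ \pi_\tau(x)Y \subset Y, \quad \{0\} \neq Y \neq X_\tau, \\
\tau &= \tfrac12\,\omega_{e_1} + \tfrac12\,\omega_{e_2}, \qquad \omega_{e_i}(x) := (x e_i, e_i) = x_{ii} .
\end{align*}
```

The invariant subspace $Y$ shows that $\pi_\tau$ is reducible, while $\rho := \tfrac12\omega_{e_1} \le \tau$ is not a multiple of $\tau$ (it vanishes on $\operatorname{diag}(0,1)$, where $\tau$ does not), so $\tau$ is not pure. Both observations agree with Theorem 11.29.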


11.8.2 Decompositions of functionals 


Theorem 11.32 (The Jordan decomposition of bounded Hermitian
functionals). Let A be a C*-algebra and $\omega \in A^*$ be Hermitian. There exist
unique $\omega^+, \omega^- \in A^*_+$ such that $\omega = \omega^+ - \omega^-$ and $\|\omega\| = \|\omega^+\| + \|\omega^-\|$.

Proof. We will only prove existence. Assume for simplicity that $\|\omega\| = 1$. The
set

$$D := \{\rho^+ - \rho^-;\ \rho^+, \rho^- \in A^*_+,\ \|\rho^+\| + \|\rho^-\| \le 1\}$$
is convex in $A^*$ by Corollary 11.19. Also, the set K(A) is weak*-compact in $A^*$
and the function $f : K(A) \times K(A) \times [0,1] \to A^*$ given by

$$f(\rho_1, \rho_2, t) = t\rho_1 - (1-t)\rho_2 \quad (\rho_1, \rho_2 \in K(A),\ t \in [0,1])$$

is weak*-continuous. Thus, the image of f, namely D, is weak*-compact in $A^*$.
Suppose by contradiction that $\omega \notin D$. By Corollary 5.21 of the strict
separation theorem, there exists $x \in A$ that strictly separates $\omega$ and D, that
is:
$$\sup_{\rho \in D} \Re\rho(x) < \Re\omega(x).$$
Since $\omega$ and all elements of D are Hermitian, one can replace x by $\frac12(x + x^*)$,
and thus assume that $x \in A_{sa}$ and
$$\sup_{\rho \in D} \rho(x) < \omega(x). \tag{7}$$
Since $S(A), -S(A) \subset D$ and x is selfadjoint, Theorem 11.20 implies that $\|x\| =
\max_{\rho \in S(A)} |\rho(x)| \le \sup_{\rho \in D} \rho(x)$. This contradicts (7) because $\omega(x) \le \|x\|$.
Hence there exist $\omega^+, \omega^- \in A^*_+$ such that $\omega = \omega^+ - \omega^-$ and $\|\omega^+\| + \|\omega^-\| \le 1$.
By the triangle inequality,

$$1 = \|\omega\| \le \|\omega^+\| + \|\omega^-\| \le 1,$$

so we have the desired equality $\|\omega\| = \|\omega^+\| + \|\omega^-\|$.
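
A concrete instance may help fix the statement. The sketch below assumes the algebra $A = C([0,1])$ and a specific Hermitian functional built from two point evaluations $\delta_0, \delta_1$; these choices are for illustration only and are not from the text.

```latex
% Assumed example (not from the text): A = C([0,1]), \delta_s(f) := f(s).
\begin{align*}
\omega(f) &:= f(0) - 2 f(1), \qquad \omega^+ := \delta_0, \qquad \omega^- := 2\,\delta_1, \\
|\omega(f)| &\le |f(0)| + 2|f(1)| \le 3\,\|f\|, \\
f_0(s) &:= 1 - 2s \ \Longrightarrow\ \|f_0\| = 1, \quad \omega(f_0) = 1 - 2\cdot(-1) = 3, \\
\|\omega\| &= 3 = 1 + 2 = \|\omega^+\| + \|\omega^-\| .
\end{align*}
```

Here $\omega = \omega^+ - \omega^-$ with both parts positive, and the norms add up exactly as in Theorem 11.32.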


An immediate consequence is that every element of A* is the linear 
combination of (at most) 4 states. 
We close this section with the following theorem, which we give without proof: 


Theorem 11.33 (The absolute value of bounded functionals). Let $\omega \in
A^*$. There exists a unique positive functional $|\omega| \in A^*_+$ such that $\||\omega|\| = \|\omega\|$
and

$$|\omega(x)|^2 \le \|\omega\|\,|\omega|(x^*x) \quad (\forall x \in A).$$


Exercises 


Unless otherwise indicated, A, B, etc., denote C*-algebras. Do not use
the non-commutative Gelfand—Naimark theorem (11.26) unless asked, 
although it can simplify some of the solutions. 


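1. Let $\{A_i\}_{i \in I}$ be a family of C*-algebras. The $\ell^\infty$-direct sum of $\{A_i\}_{i \in I}$,
denoted by $\sum^{\ell^\infty}_{i \in I} \oplus A_i$, is the set of all elements $\{a_i\}_{i \in I}$ of the set-theoretic
direct product $\prod_{i \in I} A_i$ whose supremum norm $\|\{a_i\}_{i \in I}\| := \sup_{i \in I} \|a_i\|_{A_i}$
is finite. The $c_0$-direct sum of $\{A_i\}_{i \in I}$, denoted by $\sum^{c_0}_{i \in I} \oplus A_i$, is the subset
of the $\ell^\infty$-direct sum consisting of all $\{a_i\}_{i \in I}$ that vanish at infinity in the
sense that for each $\varepsilon > 0$ there exists a finite set $F \subset I$ such that $\|a_i\|_{A_i} < \varepsilon$
for $i \in I \setminus F$.

(a) Prove that $\sum^{\ell^\infty}_{i \in I} \oplus A_i$ and $\sum^{c_0}_{i \in I} \oplus A_i$ are C*-algebras with respect to
the pointwise *-algebra operations and the supremum norm, and the
latter algebra is an ideal in the former.

(b) For a set I, what are the C*-algebras $\sum^{\ell^\infty}_{i \in I} \oplus \mathbb{C}$ and $\sum^{c_0}_{i \in I} \oplus \mathbb{C}$?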


2. Let $\varphi : A \to B$ be a *-homomorphism. Show that $\varphi$ maps $A_+$ into $B_+$ and
the open unit ball of A into that of B. Show that if $\varphi(A) = B$, then “into”
can be replaced by “onto” in both places.


3. Suppose that A is unital and $u \in G(A)$. Prove that u is unitary iff $\|u\| =
1 = \|u^{-1}\|$. (Compare Exercise 24 of Chapter 7.)


4. The absolute value of $a \in A$ is the element $|a| := (a^*a)^{1/2} \in A_+$. Prove that
$\||a|\| = \|a\|$, and that if a is selfadjoint, then $|a| = a^+ + a^-$.


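5. Let $x \in A$ and $a \in A_+$ be such that $x^*x \le a$. Let $0 < \alpha < \frac12$. We will
prove that there exists $v \in \overline{Aa}$, where $Aa := \{ya;\ y \in A\} \subset A$, such that $x = va^\alpha$.

(a) As a “warm-up”, explain why the assertion is clear if a is invertible.

(b) For $n \in \mathbb{N}$, let $v_n := x(a + \frac1n 1)^{-1/2} a^{1/2 - \alpha}$ (the computations are in the unitization of A).
Verify that $v_n \in \overline{Aa}$.

(c) Explain each of the following steps, where $b_n := (a + \frac1n 1)^{-1/2}$ for $n \in \mathbb{N}$:

$$\|v_n - v_m\|^2 = \|a^{1/2-\alpha}(b_n - b_m)\,x^*x\,(b_n - b_m)a^{1/2-\alpha}\|$$
$$\le \|a^{1/2-\alpha}(b_n - b_m)\,a\,(b_n - b_m)a^{1/2-\alpha}\|$$
$$= \|a^{1/2}(b_n - b_m)a^{1/2-\alpha}\|^2 \xrightarrow[n,m \to \infty]{} 0.$$

Consequently, the limit $v := \lim_{n \to \infty} v_n$ exists in A.

(d) Explain each of the following steps:

$$\|x - v_n a^\alpha\|^2 = \|(1 - b_n a^{1/2})\,x^*x\,(1 - b_n a^{1/2})\|$$
$$\le \|(1 - b_n a^{1/2})\,a\,(1 - b_n a^{1/2})\|.$$

Conclude that $x = va^\alpha$, as desired.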


6. Let $x \in A$ and $0 < \alpha < 1$. Use Exercise 5 to prove that there exists $v \in A$
such that $x = v|x|^\alpha$.


7. Prove that a separable C*-algebra has an approximate identity that is a 
sequence. 


8. Let I be an ideal of A and J an ideal of I. Prove that J is an ideal of A.


9. A C*-subalgebra B of a C*-algebra A is called hereditary if for every $a \in A$
and $b \in B$, if $0 \le a \le b$ then $a \in B$. Prove that every ideal I of a C*-algebra
A is hereditary. (Hint: let $a \in A$ and $b \in I$ be such that $0 \le a \le b$.
For an approximate identity $\{e_\alpha\}$ for I, use the C*-identity and positivity
to show that $\lim_\alpha a^{1/2}e_\alpha = a^{1/2}$, which implies that $a \in I$.)


10. Let I be an ideal of A. Let $a \in A$ and $i \in I_+$ be such that $a^*a \le i$. Prove
that $a \in I$. (In particular, for $a \in A$, this implies that $a \in I$ iff $a^*a \in I$.
This also yields another proof that I is hereditary.) (Hint: Exercise 5.)


11. Let $\{B_i\}$ be a family of C*-subalgebras of A whose union $B := \bigcup_i B_i$ is a
dense (*-)subalgebra of A. We will prove that for every ideal I of A, the
intersection $I \cap B = \bigcup_i (I \cap B_i)$ is dense in I.

(a) Let $J := \overline{I \cap B}$. Prove that J is an ideal of A (contained in I).

(b) Let $q : A \to A/J$ and $Q : q(A) \to q(A)/q(I)$ be the quotient maps.
Prove that for each i, $q(I) \cap q(B_i) = \{0\}$, so that $Q|_{q(B_i)}$ is injective.
(Hint: $J \cap B_i = I \cap B_i$.)

(c) Prove that Q is isometric and deduce that $J = I$.


12. It is easier to solve this question for unital A first.

(a) Prove that for every $a \in A$ and every state $\omega \in S(A)$ we have $|\omega(a)| \le
\omega(|a|^2)^{1/2}$.

(b) Let $a \in A_{sa}$ and $\omega \in S(A)$. Assume that $\omega(a^2) = \omega(a)^2$. Prove that
$\omega(ab) = \omega(a)\omega(b) = \omega(ba)$ for all $b \in A$.

(c) Let $a \in A_{sa}$. Assume that for all $0 \neq b \in A_+$, there is $\omega \in S(A)$ such
that $\omega(a^2) = \omega(a)^2$ and $\omega(b) \neq 0$. Prove that a belongs to the center
$\{x \in A;\ xy = yx \text{ for every } y \in A\}$ of A.


13. Suppose that $p \in A$ is a projection (i.e., a selfadjoint idempotent: $p^2 = p =
p^*$) that is not a unit for A. Prove the existence of a state $\omega$ of A such that
$\omega(p) = 0$. (Hint: there is $0 \neq a \in A$ such that $ap = 0$. In fact, one can find
such a that is positive and of norm 1. Then $a(1-p) = a = (1-p)a$ (where,
as usual, 1 is the unit of the unitization of A), thus $a = (1-p)a(1-p) \le (1-p)1(1-p) =
1-p$, and hence $0 \le a + p \le 1$. As a result, taking $\omega$ to be a state of A
with $\omega(a) = 1$, we must have $\omega(p) = 0$.)


14. We know that if A is unital, then S(A) is weak*-closed in $A^*$. Let us
prove the converse: if A is not unital, then S(A) is not weak*-closed in
$A^*$. To this end, we show that $0 \in \overline{S(A)}^{\,\mathrm{weak}^*}$.

(a) Prove that for each $x \in A_+$ and $\varepsilon > 0$ there is $\omega \in S(A)$ such that
$\omega(x) < \varepsilon$ by considering the following complementary cases.
Case 1: 0 is an isolated point in $\sigma(x)$. Use the operational calculus
to show that x is dominated by a positive multiple of a projection in
A. Then use Exercise 13.
Case 2: 0 is an accumulation point in $\sigma(x)$. By Theorem 11.20 (vi),
it suffices to replace A by its C*-subalgebra generated by x. Thus, by

Exercise 30 in Chapter 7, we can assume that for some compact subset
$K \subset \mathbb{C}$ containing 0 as an accumulation point, A is the C*-subalgebra
$\{f \in C(K);\ f(0) = 0\}$ of C(K) and x is the identity function.

(b) Use Part (a) to show that $0 \in \overline{S(A)}^{\,\mathrm{weak}^*}$, either by the bare definition
of the weak*-topology combined with positive functional techniques,
or using separation.


15. Use Exercise 14 to show that if A is not unital, then $\overline{\mathrm{co}}^{\,\mathrm{weak}^*}(S(A)) =
\mathrm{co}(S(A) \cup \{0\}) = \{\omega \in A^*_+;\ \|\omega\| \le 1\}$ (= K(A) of Section 11.8).


16. Let $\pi$ be a representation of A on a Hilbert space X. Prove that the
following conditions are equivalent:

(i) $\pi$ is non-degenerate;

(ii) for every approximate identity $\{e_\alpha\}$ of A and every $\eta \in X$ we have
$\lim_\alpha \pi(e_\alpha)\eta = \eta$; and

(iii) for some approximate identity $\{e_\alpha\}$ of A and every $\eta$ in some dense
subspace of X we have $\lim_\alpha \pi(e_\alpha)\eta = \eta$.


17. In this exercise we fill the missing details in the proof of the GNS
theorem (11.23) in the non-unital case. Let A be a non-unital C*-algebra
and $\omega$ a state of A.

(a) Suppose that $\pi$ is a representation of A on a Hilbert space X and
$\zeta \in X$ is a unit vector such that $\omega = \omega_\zeta \circ \pi$. Let $\{e_\alpha\}$ be an approximate
identity for A. Show that $\lim_\alpha \pi(e_\alpha)\zeta = \zeta$.

(b) Let $\tilde\omega$ be the (unique) extension of $\omega$ to a state of the unitization of A
(cf. Theorem 11.20). Apply the GNS construction to the unitization and $\tilde\omega$ to get a
representation $\tilde\pi$ of the unitization on a Hilbert space X and a unit cyclic vector
$\zeta$ for $\tilde\pi$ such that $\tilde\omega = \omega_\zeta \circ \tilde\pi$. Prove that $\pi := \tilde\pi|_A$ is a representation
of A on X with $\zeta$ as a cyclic vector (which clearly satisfies $\omega = \omega_\zeta \circ \pi$).

(c) Let U be as in the uniqueness part of the statement of the GNS
theorem. Show that indeed U maps $\zeta$ to $\eta$.


18. Recall that P(A) is the set of pure states of A.

(a) Prove that $\Phi(A) \subset P(A)$. (Hint: for $\phi \in \Phi(A)$, consider $\ker\phi$.)

(b) Prove that if A is commutative then $\Phi(A) = P(A)$.
(Hint: Exercise 17 of Chapter 7.)


19. A positive functional $\omega \in A^*_+$ is called faithful if for every $a \in A_+$, $\omega(a) = 0$
implies that $a = 0$. Prove that the GNS representation associated with a
faithful state is faithful (i.e., injective). Remark that the converse is false.


20. Let $n \in \mathbb{N}$. Consider the matrix algebra $M_n(A)$ of $n \times n$ matrices with
coefficients in A. It becomes a *-algebra with the involution given by


$\bigl((a_{ij})_{1 \le i,j \le n}\bigr)^* := (a_{ji}^*)_{1 \le i,j \le n}$. Prove that there is a (necessarily unique!)
norm on $M_n(A)$ making it a C*-algebra. (Hint: let X be a Hilbert space.
Construct a natural *-isomorphism between $M_n(B(X))$ and $B(X^{\oplus n})$,
where $X^{\oplus n}$ is the Hilbert space direct sum of n copies of X, and deduce
that there is a norm on the *-algebra $M_n(B(X))$ making it a C*-algebra.
Finally, use the non-commutative Gelfand-Naimark theorem.)


21. We continue Exercise 20.

(a) Let $n \in \mathbb{N}$, X be a Hilbert space and $(T_{ij})_{1 \le i,j \le n} = T \in M_n(B(X))$.
Prove that T is positive iff for every n vectors $x_1, \ldots, x_n$ in X we have
$\sum_{i,j=1}^{n} (T_{ij}x_j, x_i) \ge 0$.

(b) Assume that A is unital. Prove that for $a \in A$ with $\|a\| \le 1$, the
matrix $\begin{pmatrix} 1 & a \\ a^* & 1 \end{pmatrix}$ is positive in $M_2(A)$.


22. (a) Using the notation of Sections 11.7 and 11.8, prove that

$$\sum_{\omega \in P(A)} \oplus \pi_\omega$$

is a faithful representation of A. (Notice the difference between this
and the non-commutative Gelfand-Naimark theorem.)

(b) What form does this representation take if A is commutative?


23. Let $\pi$ be a non-degenerate representation of A on a Hilbert space X.

(a) Prove that there exists a subset $Z \subset X$ consisting of unit vectors such
that $X = \sum_{\zeta \in Z} \oplus \overline{\pi(A)\zeta}$.

(b) For every $\zeta \in Z$, let $\pi_\zeta$ be the representation of A on the Hilbert
space $X_\zeta := \overline{\pi(A)\zeta}$ given by reducing $\pi(a)$ to $X_\zeta$ for each $a \in A$
(notice that $\pi_\zeta$ is indeed a well-defined representation!). Prove that
$\pi = \sum_{\zeta \in Z} \oplus \pi_\zeta$, and deduce that up to unitary equivalence, $\pi =
\sum_{\zeta \in Z} \oplus \pi_{\omega_\zeta \circ \pi}$, where we used the notation of the GNS theorem. (Hint:
Example 11.25.)


24. Prove that each non-degenerate representation of A is a sub-representation
of the universal representation of A. (This is the essence of its universality.)


25. Prove that for each $\omega \in A^*$ there exist a representation $\pi$ of A on a Hilbert
space X and vectors $\zeta, \eta \in X$ such that $\omega(x) = (\pi(x)\zeta, \eta)$ for all $x \in A$
and $\|\omega\| = \|\zeta\|\,\|\eta\|$. (Hint: for simplicity, assume that $\|\omega\| = 1$. Apply the
GNS construction to the absolute value $|\omega|$ of $\omega$ (see Theorem 11.33) to
obtain ($X := X_{|\omega|}$, $\pi := \pi_{|\omega|}$, $\zeta := \zeta_{|\omega|}$). Prove that the map from $\pi(A)\zeta$
to $\mathbb{C}$ given by $\pi(x)\zeta \mapsto \omega(x)$, $x \in A$, is a (well defined) linear functional
of norm at most 1.)


12 


Von Neumann algebras 


In Chapter 11 we studied C*-algebras. This chapter is devoted to a particular 
class of C*-algebras called von Neumann algebras after John von Neumann who, 
together with Francis Joseph Murray, laid the foundations of this theory in a 
series of long papers of extraordinary depth and insight. A von Neumann algebra 
on a Hilbert space H is a *-subalgebra of B(H) containing the identity I and 
closed in the weak/strong operator topology. This is more strict than (concrete) 
C*-algebras acting on H, which are only asked to be closed in the (operator) 
norm topology. 

After introducing the chapter, we present the two great foundations of von 
Neumann algebra theory. The first is von Neumann’s double commutant theorem. 
Published in 1930, it says that a *-subalgebra of B(H) containing I is a von
Neumann algebra—namely, it is weak operator closed—iff it equals its double 
commutant in B(H). This connects an analytic property (w.o.-closedness) with 
an algebraic property (being equal to the double commutant). Kadison called 
this magnificent result “the first theorem of the subject of operator algebras”, 
and practically the whole theory of von Neumann algebras depends on it. 

The second is Kaplansky’s density theorem, which states that for a von 
Neumann algebra R and a w.o.-dense *-subalgebra A of R, every element 
T € R can be approximated in the w.o.-topology by elements of A with norm 
at most ||T||. Being able to control the norms of the approximating elements 
is a profoundly useful tool. G. K. Pedersen wrote: “The density theorem is 
Kaplansky’s great gift to mankind”. 

It is noteworthy that the proofs of these two theorems use “matrix tricks”. 
Over the years such elegant tricks have become common in operator algebras. 

The next short topic is the polar decomposition of operators. In particular, we 
prove that if the original operator belongs to a von Neumann algebra, then so do 
the two factors of its decomposition. See Exercise 7 for the polar decomposition 
of unbounded operators. 

The definition of von Neumann algebras is “concrete”, that is, they consist 
of bounded operators on a Hilbert space. There is also an “abstract” definition 


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press.
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0012




via Dixmier and Sakai: a C*-algebra $R$ is isomorphic to a von Neumann algebra
iff it is dual to some Banach space $R_*$, called a predual of $R$. Such C*-algebras
are called W*-algebras. Moreover, in this case, the Banach space $R_*$ whose dual
is $R$ is unique. The elements of $R_*$, viewed as functionals in $R^*$, are called
normal.

We then make a short detour through two classes of bounded operators 
on Hilbert spaces, namely Hilbert-Schmidt and trace-class operators, which 
are interesting in their own right. We use the latter class to provide another 
description of the preduals of von Neumann algebras. 

In the introduction to Chapter 11 we described C*-algebra theory as “non- 
commutative topology”. Along this line, von Neumann algebra theory should be 
considered as “non-commutative measure theory”. One reason for this is that the 
commutative von Neumann algebras “are” the $L^\infty$ algebras of measure spaces;
see the text for the precise statements. 

For every C*-algebra A, the double dual A** is shown to become a von 
Neumann algebra with a universality property related to the representations of 
A. It is called the enveloping von Neumann algebra of A. 

Finally, note that projections have been omitted from this chapter except 
for places where they are essential. Projections are everywhere in von Neumann 
algebras and most of the theory revolves around them one way or the other. 
Doing them justice would require a book more focused on operator algebras. We 
thus mention them here only briefly. 


12.1 Preliminaries 


In contrast to previous chapters, here and in Chapter 13 we denote Hilbert spaces 
by H,K, etc., and reserve X,Y for other purposes. 

Let H be a Hilbert space and B(H) denote the algebra of all bounded 
operators on H. Recall from Section 6.6 that the strong (respectively, weak) 
operator topology on B(H), denoted by s.o.t. (w.o.t.), turns B(H) into a locally 
convex t.v.s. where a net $\{T_\alpha\}_\alpha$ converges to $T$ iff $T_\alpha x \to Tx$ strongly (weakly) in
$H$ for all $x \in H$ (the latter means that $(T_\alpha x, y) \to (Tx, y)$ in $\mathbb{C}$ for all $x, y \in H$).


A few basic facts are: 


• The norm topology on B(H) is finer than the s.o.t., which in turn is finer
than the w.o.t.


• A convex subset of B(H) is s.o.-closed (to be read: strong operator closed,
i.e., closed in the s.o.t.) iff it is w.o.-closed (Corollary 6.20).


• The multiplication function $B(H) \times B(H) \ni (T,S) \mapsto TS \in B(H)$ is
not (jointly) s.o.-continuous, and is also not w.o.-continuous even when
restricted to the unit ball of B(H) cross itself, unless H is finite dimensional.
However, it is s.o.-continuous when restricted to bounded sets (Exercise 17
in Chapter 8).




• The (Hilbert) adjoint function $T \mapsto T^*$ on B(H) is w.o.-continuous, but not
s.o.-continuous, even when restricted to the unit ball of B(H), unless H is
finite dimensional.


• The (norm-) closed unit ball of B(H) is w.o.-compact (Exercise 18 of
Chapter 8).


A von Neumann algebra on H (or acting on H) is a w.o.-closed *-subalgebra
of B(H) containing the identity I of B(H). Note that every linear subspace, and 
in particular, every subalgebra, of B(H) is convex, so it is w.o.-closed iff it is 
s.o.-closed. 

Evidently, B(H) itself is a von Neumann algebra on H. 

Every von Neumann algebra is a C*-algebra, but the converse is not true. 
Observe that in contrast to C*-algebras, von Neumann algebras are usually 
defined “concretely”, as algebras of bounded operators on some Hilbert space. 
An equivalent “abstract” definition of von Neumann algebras, as the class of 
C*-algebras having an additional property, is covered in Section 12.5. 

Two von Neumann algebras $R_1, R_2$ on Hilbert spaces $H_1, H_2$, respectively,
are spatially isomorphic if there exists a unitary equivalence $U : H_1 \to H_2$ such
that $UR_1U^* := \{UTU^*; T \in R_1\}$ equals $R_2$. This is evidently stronger than
$R_1, R_2$ being isomorphic as C*-algebras, namely *-isomorphic.

We begin with a few elementary results. Let H be a Hilbert space. 


Proposition 12.1. If $\{A_\alpha\}_\alpha$ is an increasing net in $B(H)_{sa}$, bounded from
above by some $B \in B(H)_{sa}$, then $A := \lim_\alpha A_\alpha$ exists in the s.o.t., and it is the
least upper bound of $\{A_\alpha\}_\alpha$ in $B(H)_{sa}$.


Proof. We are given that $A_\alpha \le B$ for every $\alpha$. We may assume without loss of
generality that $C \le A_\alpha$ for some $C \in B(H)_{sa}$, and by adding $-C$, we may assume
that $A_\alpha \ge 0$ for every $\alpha$. The net $\{A_\alpha\}_\alpha$ is thus uniformly bounded by $\|B\|$.
For every $x \in H$, the non-negative net $\{(A_\alpha x, x)\}_\alpha$ is increasing and bounded,
so it is convergent. By the polarization identity, the limit $\langle x, y\rangle := \lim_\alpha (A_\alpha x, y)$
exists for every $x, y \in H$. Now $\langle\cdot,\cdot\rangle : H \times H \to \mathbb{C}$ is a semi-inner product
that is bounded by $\|B\|$, that is, $|\langle x, y\rangle| \le \|B\|\,\|x\|\,\|y\|$ for all $x, y \in H$. By
Lemma 11.30 there is $A \in B(H)_+$ such that $(Ax, y) = \lim_\alpha (A_\alpha x, y)$ for all
$x, y \in H$. As $(A_\alpha x, x) \nearrow (Ax, x)$ for all $x \in H$, we have $A_\alpha \le A$ for every $\alpha$.
Hence, for each $x \in H$ we have

\[
\bigl\|(A - A_\alpha)^{1/2} x\bigr\|^2 = ((A - A_\alpha)x, x) \to 0,
\]

that is, $(A - A_\alpha)^{1/2} \to 0$ in the s.o.t. Since $\{(A - A_\alpha)^{1/2}\}_\alpha$ is uniformly bounded
(why?), we deduce that $A - A_\alpha = [(A - A_\alpha)^{1/2}]^2 \to 0$ in the s.o.t. If $A' \in B(H)_{sa}$
is so that $A_\alpha \le A'$ for every $\alpha$, namely, $(A_\alpha x, x) \le (A'x, x)$ for every
$x \in H$, then $(Ax, x) \le (A'x, x)$; that is, $A \le A'$.
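
The "(why?)" in the proof can be settled by a short estimate. The following sketch assumes the reductions made in the proof ($A_\alpha \ge 0$ for every $\alpha$, so that $0 \le A - A_\alpha \le A$) and uses the C*-identity $\|S^{1/2}\|^2 = \|S\|$ for $S \ge 0$; it is offered as one possible way to fill the gap, not necessarily the one the authors have in mind.

```latex
% Since 0 <= A - A_alpha <= A, the norms are dominated by \|A\|:
\[
  \bigl\| (A - A_\alpha)^{1/2} \bigr\|^2 = \| A - A_\alpha \| \le \| A \| .
\]
% Consequently, for every x in H,
\[
  \| (A - A_\alpha) x \|
  \le \bigl\| (A - A_\alpha)^{1/2} \bigr\| \, \bigl\| (A - A_\alpha)^{1/2} x \bigr\|
  \le \| A \|^{1/2} \, \bigl\| (A - A_\alpha)^{1/2} x \bigr\| \to 0 .
\]
```

The second display is exactly the step from s.o.-convergence of the square roots to s.o.-convergence of $A - A_\alpha$.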


Recall from Chapter 11 that by a projection we mean “orthogonal projection” 
(Section 8.2). 




Corollary 12.2. Let $R$ be a w.o.- (equivalently, s.o.-) closed *-subalgebra of
$B(H)$. Then $R$ has a unit, which is the projection of $H$ onto
$\overline{\mathrm{span}}\{Tx; T \in R, x \in H\}$.


Proof. Since $R$ is a C*-algebra, it has an approximate identity $\{P_\alpha\}_\alpha$ by
Theorem 11.8. By Proposition 12.1, $\{P_\alpha\}_\alpha$ has an s.o.-limit, say $P \in R$, which
satisfies $0 \le P \le I$. For every $T \in R$, $PT$ is the s.o.-limit of the net $\{P_\alpha T\}_\alpha$,
which converges in norm to $T$. Thus $PT = T$, and similarly $TP = T$. Hence, $P$
is a unit for $R$, and in particular a projection.

Since $P$ is the s.o.-limit of a net in $R$, its range is evidently contained in
$H_0 := \overline{\mathrm{span}}\{Tx; T \in R, x \in H\}$. Conversely, the fact that $PT = T$ for all $T \in R$
shows that $H_0$ is contained in the range of $P$.


Corollary 12.2 implies that we do not lose anything by requiring that a von
Neumann algebra on $H$ should contain the identity of $B(H)$ by definition. It
also implies that for every *-subalgebra $R$ of $B(H)$ that is non-degenerate, that
is, $\overline{\mathrm{span}}\{Tx; T \in R, x \in H\} = H$, the closure $\overline{R}^{\text{w.o.t.}} = \overline{R}^{\text{s.o.t.}}$ is a von Neumann
algebra on $H$.


Theorem 12.3. Let R be a von Neumann algebra on H. 


(i) If $T \in R$ is normal and $E$ is the resolution of the identity for $T$, then
$E(\sigma) \in R$ for every Borel subset $\sigma$ of $\mathbb{C}$.


(ii) For every $T \in R$, the projections onto $\ker T$ and $\overline{TH}$ belong to $R$.


(iii) The set of all projections in $R$ spans a norm-dense subspace of $R$.


Proof. (i) Let $\sigma \subset \mathbb{C}$ be compact. It makes no difference to assume that
$\sigma \subset \sigma(T)$. By Urysohn's lemma, there is a bounded sequence $\{f_n\}_{n=1}^\infty$ in $C(\sigma(T))$
converging pointwise to the indicator $\chi_\sigma$ of $\sigma$. Then $f_n(T) \in R$ for every $n$, and
$f_n(T) \to \chi_\sigma(T) = E(\sigma)$ in the s.o.t. by Theorem 9.6. Therefore $E(\sigma) \in R$.

If $\sigma \subset \mathbb{C}$ is a general Borel set, consider the directed set $\mathcal{K}$ of all compact
subsets $K$ of $\sigma$ ordered by inclusion. Then $E(K) \in R$ for all $K \in \mathcal{K}$ by the
foregoing, and $\lim_{K \in \mathcal{K}} E(K) = E(\sigma)$ in the w.o.t. by the regularity of $E$. In
conclusion, $E(\sigma) \in R$.

(ii) If $T \in R$ is normal, then the projections onto $\ker T$ and $\overline{TH}$ are $E(\{0\})$
and $E(\mathbb{C} \setminus \{0\})$, respectively (see Theorem 9.8 and Exercise 16 in Chapter 9),
which belong to $R$ by (i). If $T$ is general, then $\ker T = \ker |T|$ and $\overline{TH} = \overline{|T^*|H}$
(see the text preceding Theorem 12.11), so the corresponding projections are in $R$
by the normal case applied to $|T|, |T^*| \in R$.

(iii) Every element in $R$ is a linear combination of two selfadjoint elements
in $R$. By the spectral theorem, every selfadjoint operator is the (norm!) limit of
linear combinations of its spectral projections, which belong to $R$ by (i). Indeed,
if $T \in R_{sa}$ has resolution of the identity $E$, then the identity function on $\sigma(T)$
is the uniform limit of a sequence $\{f_n\}_{n=1}^\infty$ of simple measurable functions on
$\sigma(T)$, and so every $f_n(T)$ is a linear combination of projections from
$\{E(\sigma); \sigma \subset \mathbb{C} \text{ Borel}\}$ and $f_n(T) \to T$ in norm.


Part (iii) of Theorem 12.3 shows that projections are plentiful in von 
Neumann algebras. This is in sharp contrast to C*-algebras, in which projections 




may be scarce. For instance, the only projections in C([0,1]) are zero and the 
unit element; and there are even simple unital C*-algebras with this property! 

We conclude this section with direct sums of von Neumann algebras. Recall 
from Section 8.6 the construction of direct sums of Hilbert spaces and operators 
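on them. Let $\{R_\alpha\}_{\alpha \in A}$ be a family of von Neumann algebras acting on Hilbert
spaces $\{H_\alpha\}_{\alpha \in A}$, respectively. The set of all direct sums of operators of the
form $\sum_{\alpha \in A} \oplus T_\alpha$, where $T_\alpha \in R_\alpha$ for every $\alpha \in A$, is called the direct sum of
the von Neumann algebras $\{R_\alpha\}_{\alpha \in A}$ and is denoted by $\sum_{\alpha \in A} \oplus R_\alpha$. It is not
difficult to observe that $\sum_{\alpha \in A} \oplus R_\alpha$ is a von Neumann algebra on the Hilbert
space $\sum_{\alpha \in A} \oplus H_\alpha$.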


12.2 Commutants 


This section presents the double commutant theorem of von Neumann. This 
spectacular result is the cornerstone of the entire deep theory of von Neumann
algebras, and its importance cannot be overstated. It says that for particular subalgebras
of B(H) the algebraic property of being equal to the double commutant is 
equivalent to the analytic property of being w.o.-closed. 

Let $H$ be a Hilbert space and $R \subset B(H)$. Recall that the commutant of $R$
(in $B(H)$) is the set

\[
R' := \{T \in B(H); TA = AT \text{ for all } A \in R\}.
\]


The commutant $R'$ is a subalgebra of $B(H)$ containing $I$. It is also w.o.-closed.
Indeed, if the net $\{T_j\}_{j \in J} \subset R'$ converges to the operator $T \in B(H)$ in the
w.o.t., then for all $x, y \in H$ and $A \in R$,

\[
(TAx, y) = \lim_j (T_j Ax, y) = \lim_j (AT_j x, y)
= \lim_j (T_j x, A^* y) = (Tx, A^* y) = (ATx, y),
\]

hence, $TA = AT$ for all $A \in R$, that is, $T \in R'$.
Also, if $R$ is selfadjoint (i.e., $T^* \in R$ whenever $T \in R$), then so is $R'$. In this
case, $R'$ is a von Neumann algebra on $H$. It follows that the second commutant

\[
R'' := (R')'
\]

is a von Neumann algebra containing $R$ (trivially). In particular, the relation
$R = R''$ implies that $R$ is w.o.-closed. We show below that the converse is also
true provided that $R$ is a *-subalgebra, that is, a selfadjoint subalgebra, of $B(H)$.
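
As a finite-dimensional illustration (an example chosen here, not taken from the text), take $H = \mathbb{C}^2$ and let $D \subset M_2(\mathbb{C}) = B(H)$ be the algebra of diagonal matrices. A direct computation suggests that $D' = D$, and hence $D'' = D$:

```latex
\[
\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
\begin{pmatrix} s & t \\ u & v \end{pmatrix}
=
\begin{pmatrix} as & at \\ bu & bv \end{pmatrix},
\qquad
\begin{pmatrix} s & t \\ u & v \end{pmatrix}
\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
=
\begin{pmatrix} as & bt \\ au & bv \end{pmatrix}.
\]
% The two products agree for all a, b in C iff (a - b) t = 0 = (a - b) u
% for all a, b, that is, iff t = u = 0. Hence D' = D and D'' = (D')' = D.
```

By contrast, for the scalars one gets $(\mathbb{C}I)' = M_2(\mathbb{C})$ and $(\mathbb{C}I)'' = \mathbb{C}I$, so both algebras equal their double commutants, as they must, being w.o.-closed *-subalgebras containing $I$.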


Theorem 12.4 (Von Neumann's double commutant theorem). Let $R$
be a *-subalgebra of $B(H)$ containing $I$. Then $R$ is a von Neumann algebra
(equivalently, is w.o.-closed) iff $R = R''$.


We will make essential use of Lemma 11.27, by which when $R$ is a *-subalgebra
of $B(H)$, a projection $P \in B(H)$ whose range has the form $\overline{RS}$, where $S \subset H$
and $RS := \mathrm{span}\{Tx; T \in R, x \in S\}$, belongs to $R'$.




Another ingredient in the proof is the following. Let $n \in \mathbb{N}$ and consider the
Hilbert space $H^{\oplus n}$, the direct sum of $n$ copies of $H$. The *-algebra $B(H^{\oplus n})$ is
*-isomorphic to the matrix *-algebra $M_n(B(H))$ of $n \times n$-matrices with coefficients
in $B(H)$ with the involution $[T_{ij}]_{1 \le i,j \le n}^* = [T_{ji}^*]_{1 \le i,j \le n}$. In particular, two
operators on $H^{\oplus n}$ commute iff their associated matrices commute.

For $S \in B(H)$, let $S^{(n)}$ be the operator on $H^{\oplus n}$ with matrix $\mathrm{diag}(S, \dots, S)$
(the diagonal matrix with $S$ on its diagonal). That is, $S^{(n)}$ is the direct sum of $n$
copies of $S$ (see Section 8.6). Then $S^{(n)} \in B(H^{\oplus n})$ and clearly $(S^{(n)})^* = (S^*)^{(n)}$.

If $R$ is as in the theorem, $R^{(n)} := \{A^{(n)}; A \in R\}$ is a *-subalgebra of $B(H^{\oplus n})$
containing the identity operator. One verifies easily that for $S \in B(H^{\oplus n}) =
M_n(B(H))$, $S \in (R^{(n)})'$ iff its associated matrix has all entries $S_{ij}$ in $R'$.
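
For $n = 2$ the last claim can be checked by hand. Writing $A^{(2)} = \mathrm{diag}(A, A)$ for $A \in R$ and $S = [S_{ij}]$ with entries $S_{ij} \in B(H)$ (notation as in the preceding paragraphs), a sketch of the computation is:

```latex
\[
A^{(2)} S =
\begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix}
\begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}
=
\begin{pmatrix} A S_{11} & A S_{12} \\ A S_{21} & A S_{22} \end{pmatrix},
\qquad
S A^{(2)} =
\begin{pmatrix} S_{11} A & S_{12} A \\ S_{21} A & S_{22} A \end{pmatrix}.
\]
% Hence A^{(2)} S = S A^{(2)} for all A in R iff A S_{ij} = S_{ij} A
% for all i, j and all A in R, i.e. iff every entry S_{ij} lies in R'.
```

The same computation with $T \in R''$ in place of $A$ and entries $S_{ij} \in R'$ shows that $T^{(2)}$ commutes with every element of $(R^{(2)})'$, that is, $T^{(2)} \in (R^{(2)})''$; the general case differs only in the size of the matrices.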


Proof of Theorem 12.4. By the comments preceding the statement of
Theorem 12.4, we must only show that if $T \in R''$, then $T \in \overline{R}^{\text{s.o.t.}} = \overline{R}^{\text{w.o.t.}}$,
that is, each s.o.-basic neighborhood $N(T, F, \epsilon) = \{S \in B(H); \|(S - T)x\| <
\epsilon \text{ for all } x \in F\}$ of $T$ meets $R$. Recall that $F$ is an arbitrary finite subset of $H$,
say $F = \{x_1, \dots, x_n\}$, and $\epsilon > 0$.

Consider first the special case when $n = 1$. Denote $Rx_1 := R\{x_1\}$ and let $P$
be the projection onto the closed subspace $\overline{Rx_1}$ of $H$. By Lemma 11.27 we have
$P \in R'$. Therefore $TP = PT$ since $T \in R''$. In particular, $\overline{Rx_1}$ is $T$-invariant.
But $x_1 = Ix_1 \in Rx_1$, so $Tx_1 \in \overline{Rx_1}$. Thus, there exists $A \in R$ such that
$\|Tx_1 - Ax_1\| < \epsilon$, that is, $A \in N(T, \{x_1\}, \epsilon)$, as wanted.

The case of arbitrary $n$ is reduced to the case $n = 1$ using $n \times n$-matrices. Let
$T \in R''$ and consider as above the s.o.-neighborhood $N(T, \{x_1, \dots, x_n\}, \epsilon)$. Set
$x := [x_1, \dots, x_n] \in H^{\oplus n}$. Then $T^{(n)} \in (R^{(n)})''$ by the last paragraph before the
proof, and therefore, by the case $n = 1$, the s.o.-neighborhood $N(T^{(n)}, \{x\}, \epsilon)$ in
$B(H^{\oplus n})$ meets $R^{(n)}$; that is, there exists $A \in R$ such that

\[
\bigl\|(T^{(n)} - A^{(n)})x\bigr\| < \epsilon.
\]

Hence, for all $k = 1, \dots, n$,

\[
\|(T - A)x_k\| \le \Bigl(\sum_{j=1}^n \|(T - A)x_j\|^2\Bigr)^{1/2} = \bigl\|(T - A)^{(n)} x\bigr\| < \epsilon,
\]

that is, $A \in N(T, \{x_1, \dots, x_n\}, \epsilon)$.


An equivalent statement of the double commutant theorem is: if $R$ is a
non-degenerate *-subalgebra of $B(H)$, then $\overline{R}^{\text{s.o.t.}} = \overline{R}^{\text{w.o.t.}} = R''$.

We now show an application of the double commutant theorem involving
cyclic and separating vectors. Let $R$ be a von Neumann algebra on a Hilbert
space $H$. A vector $x \in H$ is called cyclic for $R$ if $\overline{Rx} = H$, and it is called
separating for $R$ if for every $T \in R$, $T = 0$ if (and only if) $Tx = 0$.

Suppose that $x$ is cyclic for $R$, and let $T' \in R'$ be such that $T'x = 0$. Then
for every $T \in R$ we have $T'Tx = TT'x = 0$, and since $x$ is cyclic for $R$ and $T'$
is bounded we infer that $T' = 0$. In other words, $x$ is separating for $R'$.




Conversely, if $x$ is separating for $R'$, let $P'$ be the projection of $H$ onto its
closed subspace $\overline{Rx}$. Then $P' \in R'$ by Lemma 11.27, and evidently $P'x = x$. So
$I - P' \in R'$ and $(I - P')x = 0$. Since $x$ is separating for $R'$, we conclude that
$I = P'$, that is, $\overline{Rx} = H$. In other words, $x$ is cyclic for $R$.

We proved the first statement of the next proposition. The second one follows 
by applying the first to R’ and using the double commutant theorem. 


Proposition 12.5. Let R be a von Neumann algebra. A vector is cyclic for R 
iff it is separating for R’, and it is separating for R iff it is cyclic for R’. 


To finish this section, we use commutants to give a simple example of 
commutative von Neumann algebras. We use the following observation. 


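Lemma 12.6. Let $(X, \mathcal{A}, \mu)$ be a finite positive measure space, and $g \in
L^2(X, \mathcal{A}, \mu)$ be such that there is $M < \infty$ with $\|fg\|_2 \le M\|f\|_2$ for every
$f \in L^\infty(X, \mathcal{A}, \mu)$. Then actually $g \in L^\infty(X, \mathcal{A}, \mu)$ and $\|g\|_\infty \le M$.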


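Proof. Let $M' > M$ and consider the set $E := [|g| > M'] \in \mathcal{A}$. Then the
indicator function $f := \chi_E$ belongs to $L^\infty(X, \mathcal{A}, \mu)$, and by assumption,

\[
M' \mu(E)^{1/2} \le \Bigl(\int_E |g|^2 \, d\mu\Bigr)^{1/2} = \|fg\|_2 \le M\|f\|_2 = M\mu(E)^{1/2}.
\]

This is possible only if $\mu(E) = 0$. Since $M' > M$ was arbitrary, $\|g\|_\infty \le M$.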


Example 12.7 (Compare Example 11.21). Let $(X, \mathcal{A}, \mu)$ be a positive
measure space. For each $f \in L^\infty(X, \mathcal{A}, \mu)$, consider the multiplication operator
$M_f \in B(L^2(X, \mathcal{A}, \mu))$ given by $g \mapsto fg$, $g \in L^2(X, \mathcal{A}, \mu)$. The reader can verify
easily that this operator is well defined and $\|M_f\| \le \|f\|_\infty$. Furthermore, the map
$f \mapsto M_f$ is an injective *-homomorphism from $L^\infty(X, \mathcal{A}, \mu)$ to $B(L^2(X, \mathcal{A}, \mu))$.
We now prove that if $(X, \mathcal{A}, \mu)$ is $\sigma$-finite, then $R := \{M_f; f \in L^\infty(X, \mathcal{A}, \mu)\}$
is a commutative von Neumann algebra on $L^2(X, \mathcal{A}, \mu)$. Since the C*-algebras
$L^\infty(X, \mathcal{A}, \mu)$ and $R$ are (*-) isomorphic, it is colloquially customary to say that
$L^\infty(X, \mathcal{A}, \mu)$ “is” a commutative von Neumann algebra in this case.

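As $R$ is commutative, we have $R \subset R'$. Let $T \in R'$. Since $(X, \mathcal{A}, \mu)$ is $\sigma$-finite,
we can write $X = \bigcup_{n=1}^\infty X_n$, where $\{X_n\}_{n=1}^\infty$ is a sequence of mutually disjoint
sets in $\mathcal{A}$, each of which has finite measure. Let $g_n := T(\chi_{X_n}) \in L^2(X, \mathcal{A}, \mu)$.
Then for every $f \in L^\infty(X, \mathcal{A}, \mu)$,

\[
T(f\chi_{X_n}) = TM_f\chi_{X_n} = M_f T\chi_{X_n} = fg_n.
\]

Taking $f := \chi_{X_n}$ we get $g_n = T(\chi_{X_n}) = \chi_{X_n} g_n$; so $g_n$ is zero (a.e.) on $X_n^c$.
Moreover, $\|fg_n\|_2 = \|T(f\chi_{X_n})\|_2 \le \|T\|\,\|f\chi_{X_n}\|_2$ for every $f \in L^\infty(X, \mathcal{A}, \mu)$.
Consequently, by Lemma 12.6 applied to the “restriction” of $(X, \mathcal{A}, \mu)$ to $X_n$ we
get $g_n \in L^\infty(X, \mathcal{A}, \mu)$ and $\|g_n\|_\infty \le \|T\|$. Let $g \in L^\infty(X, \mathcal{A}, \mu)$ be defined by
$g := g_n$ on $X_n$, $n \in \mathbb{N}$. If $f \in L^\infty(X, \mathcal{A}, \mu)$ then for all $n \in \mathbb{N}$,

\[
T(f\chi_{X_n}) = fg_n = f\chi_{X_n} g = M_g(f\chi_{X_n}).
\]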




Since $\mathrm{span}\{f\chi_{X_n}; f \in L^\infty(X, \mathcal{A}, \mu), n \in \mathbb{N}\}$ is dense in $L^2(X, \mathcal{A}, \mu)$ by
Theorem 1.27, we infer that $T = M_g$. In conclusion, $R' \subset R$, hence $R = R'$.
This means not only that $R$ is a von Neumann algebra on $L^2(X, \mathcal{A}, \mu)$, but that
it is a maximal commutative von Neumann subalgebra of $B(L^2(X, \mathcal{A}, \mu))$.


12.3 Density


If R is a normed space and A is a dense subspace, then evidently every element of 
the closed unit ball of R can be approximated in norm by elements of the closed 
unit ball of A. One can ask whether this holds for algebras of operators when 
the operator norm topology is replaced by the weak/strong operator topologies. 
This turns out to be true when the algebras in question are selfadjoint. 

The following theorem bears some resemblance to Goldstine’s theorem (5.25), 
but their contexts and proofs are completely different. 


Theorem 12.8 (Kaplansky’s density theorem). Let R be a *-subalgebra of 
B(H) for a Hilbert space H and A an s.o.-dense *-subalgebra of R. Then the 
(norm-) closed unit ball of A is s.o.-dense in the closed unit ball of R. 


We first prove the theorem for selfadjoint operators. 


Proposition 12.9. The set of selfadjoint elements in the closed unit ball of A 
is s.o.-dense in the set of selfadjoint elements in the closed unit ball of R. 


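Proof. Since the closed unit ball of $R$ is norm-dense in that of $\overline{R}^{\|\cdot\|}$ and the same
is true for $A$, we may assume that $R, A$ are norm-closed, namely C*-algebras.
Moreover, note that $A_{sa}$ is s.o.-dense in $R_{sa}$. Indeed, for every $T \in R_{sa}$, let $\{T_\alpha\}$
be a net in $A$ that converges to $T$ in the s.o.t. Then the net $\{\tfrac12[T_\alpha + (T_\alpha)^*]\}$
in $A_{sa}$ converges to $T$ in the w.o.t., and therefore $T \in \overline{A_{sa}}^{\text{w.o.t.}}$. But
$\overline{A_{sa}}^{\text{w.o.t.}} = \overline{A_{sa}}^{\text{s.o.t.}}$ since $A_{sa}$ is convex, so that $T \in \overline{A_{sa}}^{\text{s.o.t.}}$.

Define $f : \mathbb{R} \to \mathbb{R}$ by $f(\lambda) := \frac{2\lambda}{1 + \lambda^2}$, $\lambda \in \mathbb{R}$. Then $f$ is continuous on $\mathbb{R}$ and
$f(\mathbb{R}) = [-1, 1]$. We claim that $f$ is “s.o.-continuous on $R_{sa}$”, namely, if a net
$\{T_\alpha\}$ in $R_{sa}$ converges to $T \in R_{sa}$ in the s.o.t., then the net $\{f(T_\alpha)\}$ converges to
$f(T)$ in the s.o.t. Indeed, for every $\alpha$ we have $\|f(T_\alpha)\| = \max_{\lambda \in \sigma(T_\alpha)} |f(\lambda)| \le 1$
and

\[
\begin{aligned}
\tfrac12\bigl(f(T_\alpha) - f(T)\bigr) &= T_\alpha(I + T_\alpha^2)^{-1} - T(I + T^2)^{-1} \\
&= (I + T_\alpha^2)^{-1}\bigl[T_\alpha(I + T^2) - (I + T_\alpha^2)T\bigr](I + T^2)^{-1} \\
&= (I + T_\alpha^2)^{-1}\bigl[(T_\alpha - T) + (T_\alpha T^2 - T_\alpha^2 T)\bigr](I + T^2)^{-1} \\
&= (I + T_\alpha^2)^{-1}(T_\alpha - T)(I + T^2)^{-1} + \tfrac14 f(T_\alpha)(T - T_\alpha)f(T).
\end{aligned}
\]

Now $T_\alpha \to T$ in the s.o.t. and the nets $\{(I + T_\alpha^2)^{-1}\}$, $\{f(T_\alpha)\}$ are bounded (by
1), so $f(T_\alpha) \to f(T)$ in the s.o.t.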




Let $S \in R_{sa}$, $\|S\| \le 1$. The continuous function $f$ is strictly increasing from
$-1$ to $1$ on $[-1, 1]$. Thus, it has a (continuous) inverse there, say, $g : [-1, 1] \to
[-1, 1]$. Let $T := g(S) \in R_{sa}$. Then $S = f(T)$. By assumption, there is a net
$\{T_\alpha\}$ in $A_{sa}$ that converges to $T$ in the s.o.t. By the foregoing, $\{f(T_\alpha)\}$ is a net
in $A_{sa}$ that is bounded by 1 and converges to $f(T) = S$ in the s.o.t.
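
The inverse $g$ used in the proof can be written down explicitly. With $f(\lambda) = 2\lambda/(1 + \lambda^2)$, the following short check (a supplement, not part of the original argument) confirms monotonicity on $[-1, 1]$ and verifies a candidate formula for $g$:

```latex
% Monotonicity on [-1,1]:
\[
f'(\lambda) = \frac{2(1 - \lambda^2)}{(1 + \lambda^2)^2} > 0 \quad (|\lambda| < 1),
\qquad f(\pm 1) = \pm 1 .
\]
% Candidate inverse on [-1,1]; note |g(s)| <= |s| <= 1:
\[
g(s) := \frac{s}{1 + \sqrt{1 - s^2}}, \qquad s \in [-1, 1].
\]
% Verification: with r := \sqrt{1 - s^2} we have (1 + r)^2 + s^2 = 2(1 + r), so
\[
1 + g(s)^2 = \frac{2}{1 + r},
\qquad
f(g(s)) = \frac{2 g(s)}{1 + g(s)^2} = \frac{2s}{1 + r} \cdot \frac{1 + r}{2} = s .
\]
```

Since $g$ is continuous on $[-1, 1] \supset \sigma(S)$, the element $g(S)$ is defined by the continuous functional calculus.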


We now use a $2 \times 2$-matrix trick to upgrade the selfadjoint case to the general
case. Let $M_2(R)$ denote the *-subalgebra of $M_2(B(H)) \cong B(H^{\oplus 2})$ consisting of
all matrices $S = (S_{ij})_{1 \le i,j \le 2}$ with $S_{ij} \in R$ for each $1 \le i, j \le 2$. Doing the same
for $A$, we see easily that $M_2(A)$ is an s.o.-dense *-subalgebra of $M_2(R)$.


Proof of Theorem 12.8. Let $S$ be in the closed unit ball of $R$. The operator
$\tilde{S} \in M_2(R)$ given by $\tilde{S} := \begin{pmatrix} 0 & S \\ S^* & 0 \end{pmatrix}$ is selfadjoint and $\|\tilde{S}\| = \|S\| \le 1$. By the
proposition, there is a net $\{T_\alpha\}$ of selfadjoint elements of the closed unit ball
of $M_2(A)$ converging to $\tilde{S}$ in the s.o.t. In particular, $\{(T_\alpha)_{12}\}$ converges to
$(\tilde{S})_{12} = S$ in the s.o.t., and $\|(T_\alpha)_{12}\| \le \|T_\alpha\| \le 1$ for every $\alpha$.
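
The two norm facts used in this proof follow from short computations. In the sketch below, $\tilde{S}$ denotes the $2 \times 2$ operator matrix with off-diagonal entries $S$ and $S^*$ (the symbol $\tilde{S}$ is a notational choice made here), and the first display uses the C*-identity:

```latex
\[
\tilde S = \begin{pmatrix} 0 & S \\ S^* & 0 \end{pmatrix},
\qquad
\tilde S^* = \tilde S,
\qquad
\tilde S^2 = \begin{pmatrix} S S^* & 0 \\ 0 & S^* S \end{pmatrix},
\]
\[
\|\tilde S\|^2 = \|\tilde S^* \tilde S\| = \|\tilde S^2\|
= \max\{\|S S^*\|, \|S^* S\|\} = \|S\|^2 .
\]
% Entries are dominated by the whole matrix: for T = (T_{ij}) in B(H \oplus H)
% and x in H, the vector T(0, x) has first coordinate T_{12} x, so
\[
\|T_{12} x\| \le \bigl\| T (0, x) \bigr\| \le \|T\| \, \|x\| .
\]
```

The last estimate also shows that s.o.-convergence of a net of matrices implies s.o.-convergence of each of its entries.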


12.4 The polar decomposition 


Every $z \in \mathbb{C}$ has its (unique) polar decomposition $z = \lambda|z|$ for some $\lambda$ in the
complex unit circle. The polar decomposition of bounded operators on Hilbert 
spaces, which we prove in this section, is a far-reaching generalization of this. A 
further extension to unbounded operators is obtained in Exercise 7. 

Let H denote a Hilbert space. 

An operator $V \in B(H)$ is called a partial isometry if for some closed subspace
$M$ of $H$, $V$ acts isometrically on $M$ (i.e., $\|Vx\| = \|x\|$ for every $x \in M$) and
$V|_{M^\perp} = 0$. The closed subspace $M$ is called the initial space of $V$, and the closed
subspace $V(M)$ is called the final space of $V$.

Projections, unitaries, and isometries are examples of partial isometries. 

The proof of the following lemma is left to the reader. 


Lemma 12.10. Let $V \in B(H)$.


(i) V is a partial isometry iff V* is a partial isometry. In this case, the initial 
and final spaces of V* are the final and initial spaces of V, respectively. 

(ii) V is a partial isometry iff V*V is a projection iff VV* is a projection iff 
V*VV* =V* iff VV*V =V. In this case, V*V, VV* are the projections 
on the initial and final spaces of V, respectively. 


Recall that the absolute value of an element $a$ of a C*-algebra $A$ is $|a| :=
(a^*a)^{1/2} \in A_+$.

Let $T \in B(H)$. The support of $T$ is the closed subspace $(\ker T)^\perp = \overline{T^*H}$ of
$H$. The support of a selfadjoint operator clearly equals the closure of its range,
and the support of a partial isometry is its initial space.

For every $x \in H$, we have

\[
\|Tx\|^2 = (Tx, Tx) = (T^*Tx, x) = (|T|^2 x, x) = \bigl\||T|x\bigr\|^2. \tag{1}
\]




It follows that the operators $T$ and $|T|$ share the same kernel. Thus, they also
share the same support, which equals $\overline{|T|H}$ as $|T|$ is selfadjoint (indeed, positive).


Theorem 12.11 (Polar decomposition). Let $H$ be a Hilbert space and $T \in
B(H)$.


(i) Existence: there exists a partial isometry $V \in B(H)$, whose initial space
is the support of $T$ and whose final space is $\overline{TH}$, such that $T = V|T|$.


(ii) Uniqueness: if $A \in B(H)_+$ and $W \in B(H)$ is a partial isometry with
initial space $\overline{AH}$ such that $T = WA$, then $A = |T|$ and $W = V$.


(iii) The operator $V$ of Part (i) also satisfies $|T^*| = V|T|V^*$ and $T = |T^*|V$.

(iv) If $R$ is a von Neumann algebra on $H$ and $T \in R$, then $V, |T| \in R$.


Proof. (i) For every $x \in H$, define $V|T|x := Tx$. By (1), the resulting map
$V : |T|H \to TH$ is well defined, isometric, and surjective, and so it extends to a
surjective isometry $V : \overline{|T|H} \to \overline{TH}$. Extend it further to $V \in B(H)$ by letting
$V$ map $(|T|H)^\perp = \ker T$ to 0. Then $V$ is a partial isometry with initial and final
spaces $\overline{|T|H} = \overline{T^*H}$ and $\overline{TH}$, respectively, and we have $T = V|T|$.

(ii) If $W, A$ are as above, then since the initial space of $W$ is $\overline{AH}$, we have
$T^*T = AW^*WA = A^2$ by Lemma 12.10, thus $A = (T^*T)^{1/2} = |T|$ (see
Section 11.3). So $V|T| = W|T|$. Hence, $W = V$, because the initial space
of both partial isometries is the support of $T$, which equals $\overline{|T|H}$.

(iii) We have $|T^*|^2 = TT^* = V|T|^2V^* = (V|T|V^*)^2$, where the last step holds
because the initial space of $V$ is the support of $T$ (= that of $|T|$). Consequently,
$|T^*| = V|T|V^*$, thus $|T^*|V = V|T|V^*V = V|T| = T$.

(iv) We already know that $|T| \in R$. By von Neumann's double commutant
theorem (12.4), it is left to show that $V \in R''$. Every element in $R'$ is the
linear combination of at most 4 unitary elements in $R'$ because $R'$ is a unital
C*-algebra (see Section 11.2). Therefore, it suffices to show that $V$ commutes
with all unitaries in $R'$. Let $U \in R'$ be unitary. Since $T \in R$,

\[
T = UTU^* = UV|T|U^* = (UVU^*)(U|T|U^*) \tag{2}
\]

(of course, we have $U|T|U^* = |T|$, but that will not be relied on). The operator
$U|T|U^*$ is positive. Since $V^*V$ is the projection on the initial space of $V$,
which is $\overline{|T|H}$, the operator $(UVU^*)^*(UVU^*) = UV^*U^*UVU^* = UV^*VU^*$
is the projection on $U(\overline{|T|H}) = \overline{(U|T|U^*)H}$. Equivalently, $UVU^*$ is a partial
isometry with initial space $\overline{(U|T|U^*)H}$. By (2) and the uniqueness of the polar
decomposition, $UVU^* = V$; equivalently, $U$ commutes with $V$, as desired.


The decomposition T = V|T| of Part (i) of the theorem is called the polar 
decomposition of T. 
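
A small worked example may help fix the definitions. The matrix below is an illustration chosen here (it does not appear in the text): on $H = \mathbb{C}^2$ with standard basis $e_1, e_2$, take the nilpotent operator $T$ sending $e_2$ to $e_1$ and $e_1$ to $0$.

```latex
\[
T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\qquad
T^*T = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},
\qquad
|T| = (T^*T)^{1/2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.
\]
% The support of T is span{e_2} and TH = span{e_1}; V maps e_2 to e_1
% and vanishes on e_1, so in this example V coincides with T:
\[
V = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\qquad
V|T| = T,
\qquad
V^*V = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},
\qquad
VV^* = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.
\]
% Part (iii): |T^*| = (TT^*)^{1/2} = diag(1, 0) = V|T|V^*, and |T^*| V = T.
```

Here $V$ is a partial isometry that is neither unitary nor a projection, and $V^*V$, $VV^*$ are the projections onto the initial and final spaces, in line with Lemma 12.10.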


Remark 12.12. Let $A \subset B(H)$ be a C*-algebra and $T \in A$. Let $T = V|T|$
be the polar decomposition of $T$. Then while $|T| \in A$, normally $V \notin A$ (even if
$I \in A$). Compare Exercise 6 in Chapter 9.




12.5 W*-algebras 


A dual Banach space is a Banach space Y for which there exists another Banach 
space X such that Y is isometrically isomorphic to X*. Such X is called a predual 
of Y. Not every Banach space is dual: for instance, $c_0$ is not a dual Banach space
(see Exercise 10). Also, if a predual exists, it is not necessarily unique up to 
isometric isomorphism (see Exercise 11). 

Observe that if $X$ is a predual of $Y$, then $X$ can be viewed as a closed
subspace of $Y^*$: indeed, if $f$ is an isometric isomorphism of $Y$ onto $X^*$, then the
map $X \ni x \mapsto (Y \ni y \mapsto (f(y))(x))$ is an isometric isomorphism of $X$ into $Y^*$
(it equals $f^* \circ \kappa_X$, where $\kappa_X$ is the canonical isometric isomorphism of $X$ into
$X^{**}$ and $f^*$ is the Banach adjoint of $f$). We will identify $X$ with this subspace
of $Y^*$ tacitly.

A W*-algebra (or an abstract von Neumann algebra) is a C*-algebra R, which 
is a dual Banach space. 

This section revolves around two facts due to Dixmier and Sakai: 


(A) The class of von Neumann algebras is, up to an isomorphism, the class 
of W*-algebras. In other words, every von Neumann algebra is a W*- 
algebra (Theorem 12.14); and conversely, every W*-algebra has a faithful 
representation as a von Neumann algebra (Theorem 12.18). 


(B) Every W*-algebra R has a unique predual. The uniqueness is not only up 
to isometric isomorphism: in fact, all preduals of R are equal as subspaces 
of R* (Theorem 12.17); equivalently, they induce on R the same weak* 
topology. 


We start with one direction of Point (A), proving it first for B(H). 

Let $H$ be a Hilbert space. For $x, y \in H$ we define $\omega_{x,y} \in B(H)^*$ by
$\omega_{x,y}(T) := (Tx, y)$, $T \in B(H)$. In particular, $\omega_x = \omega_{x,x}$. The subspace $B(H)_\sim :=
\mathrm{span}\{\omega_{x,y}; x, y \in H\}$ of $B(H)^*$ consists precisely of all linear functionals on $B(H)$
that are w.o.- (equivalently, s.o.-) continuous (see Theorem 6.19). Denote by
$B(H)_*$ the norm closure of $B(H)_\sim$ in $B(H)^*$. Then $B(H)_*$ is a Banach space.


Proposition 12.13. For every Hilbert space $H$, $B(H)$ is a W*-algebra. To
elaborate, $B(H)$ is isometrically isomorphic to the dual of $B(H)_*$, the closure in
$B(H)^*$ of the subspace of w.o.-continuous linear functionals on $B(H)$.


Proof. There exists a natural linear map $\Psi : B(H) \to (B(H)_*)^*$ given by
restricting the evaluation at $A$ to $B(H)_*$, that is, $(\Psi(A))(\omega) := \omega(A)$
($A \in B(H)$, $\omega \in B(H)_*$). We shall prove that $\Psi$ is an isometric isomorphism.

Isometricity. Let $A \in B(H)$. Since $|(\Psi(A))(\omega)| \le \|\omega\|\,\|A\|$ for each $\omega \in
B(H)_*$, we have $\|\Psi(A)\| \le \|A\|$. On the other hand,

\[
\|A\| = \sup_{\|x\| = \|y\| = 1} |(Ax, y)| = \sup_{\|x\| = \|y\| = 1} |\omega_{x,y}(A)|
= \sup_{\|x\| = \|y\| = 1} |(\Psi(A))(\omega_{x,y})| \le \|\Psi(A)\|.
\]




Thus, $\|\Psi(A)\| = \|A\|$, proving that $\Psi$ is isometric.

Surjectivity. Let $\varphi \in (B(H)_*)^*$. The map $\langle\cdot,\cdot\rangle : H \times H \to \mathbb{C}$ given by $\langle x, y\rangle :=
\varphi(\omega_{x,y})$, $x, y \in H$, is linear in the left variable and conjugate linear in the right
variable, and is bounded by $\|\varphi\|$ because $|\varphi(\omega_{x,y})| \le \|\varphi\|\,\|\omega_{x,y}\| \le \|\varphi\|\,\|x\|\,\|y\|$.
Thus, by Lemma 11.30, there is $A \in B(H)$ with $(Ax, y) = \langle x, y\rangle = \varphi(\omega_{x,y})$ for
every $x, y \in H$. Hence, $\Psi(A) = \varphi$, because these two elements of $(B(H)_*)^*$ agree
on $B(H)_\sim$, which is a dense subspace of $B(H)_*$.
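
The estimate $\|\omega_{x,y}\| \le \|x\|\,\|y\|$, used in both parts of the proof, is a one-line consequence of the Cauchy-Schwarz inequality. With $\omega_{x,y}(T) := (Tx, y)$ as defined before the proposition:

```latex
\[
|\omega_{x,y}(T)| = |(Tx, y)| \le \|Tx\| \, \|y\| \le \|T\| \, \|x\| \, \|y\|
\qquad (T \in B(H)),
\]
% so that \|\omega_{x,y}\| \le \|x\| \|y\|; in particular \|\omega_{x,y}\| \le 1
% whenever \|x\| = \|y\| = 1, which is what the isometricity step requires.
```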


Recall that if $X$ is a normed space and $N$ is a weak*-closed subspace of $X^*$,
then $N$ is a dual Banach space: it is isometrically isomorphic to the dual of the
closed subspace $\{\kappa_X(x)|_N; x \in X\}$ of $N^*$; for this see Exercise 24 in Chapter 6.


Theorem 12.14. Every von Neumann algebra is a W*-algebra. To elaborate,
if $R$ is a von Neumann algebra on a Hilbert space $H$, then it is isometrically
isomorphic to the dual of the Banach subspace $\{\omega|_R; \omega \in B(H)_*\}$ of $R^*$.


Proof. Consider the predual $B(H)_*$ of $B(H)$ described in Proposition 12.13 (we
still do not know that it is unique). Since $\omega_{x,y} \in B(H)_*$ for every $x, y \in H$, the
weak* topology on $B(H)$ induced by $B(H)_*$ is stronger than the w.o.t. Since
$R$ is w.o.-closed in $B(H)$, it is also weak*-closed there. The theorem's assertion
thus follows from the paragraph preceding the theorem.


The following result is required in the end of the next section. 


Proposition 12.15. Let R be a von Neumann algebra on a Hilbert space H.
The topologies induced on the closed unit ball of R by the w.o.t. and by the weak* 
topology of B(H) (given by the above predual) coincide. 


Proof. The w.o.t. is weaker than the weak* topology (on $B(H)$ and thus on
$R$). Conversely, let $\{A_\alpha\}$ be a net in the closed unit ball of $R$ that converges
in the w.o.t. to an element $A$ of the closed unit ball of $R$. Let $\omega \in B(H)_*$ and
$\epsilon > 0$ be given. By the definition of $B(H)_*$ in Proposition 12.13, there is a w.o.-
continuous linear functional $\omega_0$ on $B(H)$ such that $\|\omega - \omega_0\| < \epsilon$. Find $\alpha_0$ so
that $|\omega_0(A_\alpha - A)| < \epsilon$ for all $\alpha \ge \alpha_0$. Then for all such $\alpha$,

\[
|\omega(A_\alpha - A)| \le |\omega_0(A_\alpha - A)| + |(\omega - \omega_0)(A_\alpha - A)|
\le \epsilon + \|\omega - \omega_0\|\,(\|A_\alpha\| + \|A\|) \le 3\epsilon.
\]

This proves that $A_\alpha \to A$ in the weak* topology.


We now turn to Point (B). We will not provide a full proof, but only indicate 
the steps required to prove it. 


Definition 12.16. Let $R$ be a W*-algebra. A linear functional $\omega \in R^*$ is called
normal if for every bounded increasing net $\{T_\alpha\}$ in $R_+$ with least upper bound
$T \in R_+$, one has $\omega(T) = \lim_\alpha \omega(T_\alpha)$.


Theorem 12.17. Let $R$ be a W*-algebra. Then $R$ has a unique (up to isometric
isomorphism) predual $R_*$.




In fact, viewed as a subspace of $R^*$, the predual $R_*$ consists precisely of the
normal elements of $R^*$. Also, the set $(R_*)_+ := R_* \cap R^*_+$ linearly spans $R_*$.


Outline of proof. Let $R_*$ be a predual of $R$. We view $R_*$ as a closed subspace
of $R^*$.


1. $R$ is unital. This relies on the fact that the closed unit ball of $R$ is weak*-
compact by Alaoglu's theorem (5.24).


2. $R_{sa}$ is weak*-closed.


3. Abundance of elements of $R_*$: if $T \in R_{sa}$, then $T \ge 0$ iff $\omega(T) \ge 0$ for every
$\omega \in (R_*)_+$.


4. Every element of $R_*$ decomposes as a linear combination of two Hermitian
elements of $R_*$; and for every Hermitian $\omega \in R_*$ there exist $\omega^+, \omega^- \in
(R_*)_+$ with $\omega = \omega^+ - \omega^-$ and $\|\omega\| = \|\omega^+\| + \|\omega^-\|$ (compare the Jordan
decomposition for C*-algebras, Theorem 11.32).


5. Order completeness of $R_{sa}$: if $\{T_\alpha\}$ is an increasing net in $R_{sa}$ that is
bounded from above, then it has a least upper bound in $R_{sa}$.
Let us prove this step. We can assume, without loss of generality, that
$\{T_\alpha\}$ is norm bounded. Since the closed ball of any radius in $R$ is weak*-
compact, we may assume by passing to a subnet that $\{T_\alpha\}$ weak*-converges
to some $T \in R$. Since $R_{sa}$ is weak*-closed by Step 2, $T \in R_{sa}$. Fix $\alpha_0$.
For every $\omega \in (R_*)_+$, the net $\{\omega(T_\alpha)\}$ increases to $\omega(T)$; in particular,
$\omega(T_{\alpha_0}) \le \omega(T)$. By Step 3, $T_{\alpha_0} \le T$. In conclusion, $T$ is an upper bound of
$\{T_\alpha\}$. If $S \in R_{sa}$ is another upper bound of $\{T_\alpha\}$, then $\omega(T_\alpha) \le \omega(S)$ for
every $\omega \in (R_*)_+$ and $\alpha$, so $\omega(T) \le \omega(S)$. Thus $T \le S$ by Step 3.


6. An element $\omega$ of $R^*$ belongs to $R_*$ iff it is normal. This is proved first for
positive $\omega$ and then generally.


We finish with the second direction of Point (A). 


Theorem 12.18. Every W*-algebra R has a faithful representation as a von 
Neumann algebra on some Hilbert space. 


Proof. The idea is to repeat the proof of the non-commutative Gelfand-Naimark
theorem (11.26) but take only the normal states of $R$ (i.e., those that
belong to $R_*$).

So, denote by $S_n(R)$ the set of normal states of $R$. Using the language of
Section 11.7, for $\omega \in S_n(R)$ let $\pi_\omega$ be the associated GNS representation of
$R$ on a Hilbert space $H_\omega$ with cyclic vector $\zeta_\omega$ such that $\omega = \omega_{\zeta_\omega} \circ \pi_\omega$ as in
Theorem 11.23 (where $H_\omega$ is denoted by $X_\omega$). Let

$$H := \sum_{\omega \in S_n(R)} \oplus H_\omega \quad\text{and}\quad \pi := \sum_{\omega \in S_n(R)} \oplus \pi_\omega.$$


The representation $\pi$ of $R$ on $H$ is injective, since if $T \in R$ and $\pi(T) = 0$, then
for each $\omega \in S_n(R)$ we have $\pi_\omega(T) = 0$, hence $\omega(T) = (\omega_{\zeta_\omega} \circ \pi_\omega)(T) = 0$. From
Theorem 12.17 we infer that $\omega(T) = 0$ for all $\omega \in \operatorname{span}(R_*)_+ = R_*$, hence $T = 0$.




It remains to prove that $\pi(R)$ is a von Neumann algebra on $H$. By Exercise 9,
this is equivalent to the w.o.-closedness of the (norm-) closed unit ball of $\pi(R)$,
which equals the image under $\pi$ of the closed unit ball of $R$, since $\pi$ is faithful and
thus isometric. To this end, let $\{T_\alpha\}_{\alpha \in A}$ be a net in the closed unit ball of $R$ such
that $\{\pi(T_\alpha)\}_{\alpha \in A}$ converges to some $S \in B(H)$ in the w.o.t. Then for every
$\omega \in S_n(R)$ we have $\omega(T_\alpha) = (\pi(T_\alpha)\zeta_\omega, \zeta_\omega) \to (S\zeta_\omega, \zeta_\omega)$. Hence, by Theorem 12.17,
the (scalar) net $\{\omega(T_\alpha)\}_{\alpha \in A}$ converges for each $\omega \in R_*$. Consequently, the map

$$\varphi : R_* \ni \omega \mapsto \lim_\alpha \omega(T_\alpha)$$

is a well-defined linear functional on $R_*$. This functional is bounded by 1, and in
particular belongs to $(R_*)^*$, since $\{T_\alpha\}_{\alpha \in A}$ is contained in the closed unit ball of
$R$. Let $T$ be the element of the closed unit ball of $R$ corresponding to $\varphi \in (R_*)^*$
(recall that $R_*$ is the predual of $R$). This means that $\{T_\alpha\}_{\alpha \in A}$ converges in
the weak* topology to $T$. We leave to the reader to observe that $\{\pi(T_\alpha)\}_{\alpha \in A}$
converges to $\pi(T)$ in the w.o.t., and thus $S = \pi(T)$.


The faithful representation of R constructed as shown is called the universal 
normal representation of R. 


Example 12.19. We can now revisit Example 12.7. Assume that $(X, \mathcal{A}, \mu)$ is
a $\sigma$-finite positive measure space. The C*-algebra $L^\infty(X, \mathcal{A}, \mu)$ is isometrically
isomorphic to $L^1(X, \mathcal{A}, \mu)^*$ by Theorem 4.6, and hence it is a W*-algebra.


12.6 Hilbert—Schmidt and trace-class operators 


This section is devoted to two classes of operators: Hilbert—Schmidt and trace- 
class, and to the relations between them, compact operators and von Neumann 
algebras. Hilbert-Schmidt operators and their basic properties were introduced 
in Exercise 25 in Chapter 8. 

Let H denote a Hilbert space. 

For $T \in B(H)$, we would like to define the trace of $T$ to be

$$\operatorname{Tr}(T) := \sum_{\alpha \in A} (Te_\alpha, e_\alpha), \tag{1}$$

where $\{e_\alpha\}_{\alpha \in A}$ is an orthonormal basis for $H$ and the sum is defined as the
limit of the net $\{\sum_{\alpha \in F} (Te_\alpha, e_\alpha)\}_{F \in \mathcal{F}}$, where $\mathcal{F}$ is the set of all finite subsets
of $A$ directed by inclusion. For this to make sense, this series has to converge
(necessarily absolutely) for every choice of orthonormal basis and its sum has to
be independent of this choice. These conditions do not hold true for every $T$;
when they do, we say that $T$ has a trace. If $T$ has a trace and $U \in B(H)$ is
unitary, then $UTU^*$ also has a trace and $\operatorname{Tr}(UTU^*) = \operatorname{Tr}(T)$, because $U$ maps
each orthonormal basis of $H$ onto another one.




For $T \in B(H)$, let

$$\|T\|_2 := \Big(\sum_{\alpha \in A} \|Te_\alpha\|^2\Big)^{1/2} \in [0, \infty],$$

where $\{e_\alpha\}_{\alpha \in A}$ is an orthonormal basis for $H$. This sum does not depend on the
choice of orthonormal basis, because if $\{f_\beta\}_{\beta \in A}$ is another basis, then

$$\sum_{\alpha \in A} \|Te_\alpha\|^2 = \sum_{\alpha,\beta \in A} |(Te_\alpha, f_\beta)|^2 = \sum_{\alpha,\beta \in A} |(e_\alpha, T^*f_\beta)|^2
= \sum_{\alpha,\beta \in A} |(T^*f_\beta, e_\alpha)|^2 = \sum_{\beta \in A} \|T^*f_\beta\|^2. \tag{2}$$

We say that $T$ is a Hilbert-Schmidt operator if $\|T\|_2 < \infty$. The set of all such
operators is denoted by $L^2(B(H))$ or $L^2(H)$. Observe that

$$L^2(H) = \{T \in B(H);\ \{\|Te_\alpha\|\}_{\alpha \in A} \in \ell^2(A)\}$$

and $\|T\|_2 = \big\|\{\|Te_\alpha\|\}_{\alpha \in A}\big\|_{\ell^2(A)}$ for all $T \in L^2(H)$.
As a result, $(L^2(H), \|\cdot\|_2)$ is a normed space with respect to the pointwise
operations.

Note that (2) also shows that $\|T\|_2 = \|T^*\|_2$. Moreover

$$\sum_{\alpha \in A} \|Te_\alpha\|^2 = \sum_{\alpha \in A} (Te_\alpha, Te_\alpha) = \sum_{\alpha \in A} (|T|^2 e_\alpha, e_\alpha).$$

Hence, $\|T\|_2 = \big\||T|\big\|_2$, and if $T \in L^2(H)$, then $|T|^2$ has a trace and
$\|T\|_2 = (\operatorname{Tr}(|T|^2))^{1/2}$. In any case, since every positive operator has a (positive) square
root, the sum in $[0, \infty]$ of the series (1) is independent of the choice of basis when
$T$ is positive.


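Example 12.20. Let $H$ be a Hilbert space with orthonormal basis $\{e_\alpha\}_{\alpha \in A}$
and let $\{\lambda_\alpha\}_{\alpha \in A} \in \ell^2(A)$. Then the diagonal operator $T \in B(H)$ given by
$Tx := \sum_{\alpha \in A} \lambda_\alpha (x, e_\alpha) e_\alpha$, $x \in H$, is Hilbert-Schmidt, and
$\|T\|_2 = \big(\sum_{\alpha \in A} |\lambda_\alpha|^2\big)^{1/2} = \|\{\lambda_\alpha\}_{\alpha \in A}\|_{\ell^2(A)}$.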
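As a finite-dimensional illustration of the basis independence established in (2), the following sketch computes $\sum_\alpha \|Te_\alpha\|^2$ for one real $2 \times 2$ matrix in two orthonormal bases of $\mathbb{R}^2$. The matrix, the rotation angle, and the helper names `apply` and `hs_norm_squared` are arbitrary choices made for this example and do not come from the text.

```python
import math

def apply(T, v):
    """Return the matrix-vector product T v for a list-of-rows matrix T."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in T]

def hs_norm_squared(T, basis):
    """Sum of ||T e||^2 over the vectors e of an orthonormal basis."""
    return sum(sum(abs(c) ** 2 for c in apply(T, e)) for e in basis)

# Hypothetical example operator on R^2.
T = [[1.0, 2.0],
     [3.0, 4.0]]

standard = [[1.0, 0.0], [0.0, 1.0]]
angle = 0.7  # arbitrary rotation angle; any value gives an orthonormal basis
rotated = [[math.cos(angle), math.sin(angle)],
           [-math.sin(angle), math.cos(angle)]]

in_standard = hs_norm_squared(T, standard)  # 1 + 9 + 4 + 16 = 30
in_rotated = hs_norm_squared(T, rotated)

print(in_standard, in_rotated)  # the two values agree up to rounding
```

In finite dimensions the common value is the sum of the squared matrix entries, that is, the square of the Frobenius norm.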


For $x, y \in H$, let $\theta_{y,x} := (\cdot, x)y \in B(H)$. That is, $\theta_{y,x}$ is the operator
$H \ni w \mapsto (w, x)y$. Then $\operatorname{span}\{\theta_{y,x};\ x, y \in H\}$ equals the set of finite rank operators
in $B(H)$, denoted by $F(H)$.


Proposition 12.21 (Basic properties of Hilbert-Schmidt operators).

(i) For every $T, S \in B(H)$ we have $\|T\|_2 = \|T^*\|_2$, $\|T\| \le \|T\|_2$ and
$\|TS\|_2 \le \|T\|_2 \|S\|,\ \|T\| \|S\|_2$. Thus, $L^2(H)$ is a generally non-closed, selfadjoint
ideal of $B(H)$.




(ii) $F(H) \subset L^2(H) \subset K(H)$.
(iii) For every $S, T \in L^2(H)$, the operator $ST$ has a trace.
(iv) The formula $(T, S)_2 := \operatorname{Tr}(S^*T)$, $S, T \in L^2(H)$, defines an inner product
on $L^2(H)$ (whose induced norm is $\|\cdot\|_2$), making it a Hilbert space.


Proof. (i) Let $T, S \in B(H)$. We proved that $\|T\|_2 = \|T^*\|_2$. Every unit vector
$x \in H$ can be complemented to an orthonormal basis of $H$. Thus, $\|Tx\| \le \|T\|_2$,
so $\|T\| \le \|T\|_2$. Furthermore, if $\{e_\alpha\}_{\alpha \in A}$ is an orthonormal basis of $H$, then

$$\|ST\|_2^2 = \sum_{\alpha \in A} \|STe_\alpha\|^2 \le \|S\|^2 \sum_{\alpha \in A} \|Te_\alpha\|^2 = \|S\|^2 \|T\|_2^2,$$

proving that $\|ST\|_2 \le \|S\| \|T\|_2$, and

$$\|ST\|_2 = \|(ST)^*\|_2 = \|T^*S^*\|_2 \le \|T^*\| \|S^*\|_2 = \|T\| \|S\|_2.$$


(ii) For $x, y \in H$, one sees easily that $\|\theta_{x,y}\|_2 = \|x\| \|y\|$, so $\theta_{x,y} \in L^2(H)$.
Hence $F(H) \subset L^2(H)$. The fact that $L^2(H) \subset K(H)$, that is, every Hilbert-Schmidt
operator is compact, was proved in Exercise 25 in Chapter 8.

(iii) Let $S, T \in L^2(H)$. By the polarization identity,

$$(STe_\alpha, e_\alpha) = (Te_\alpha, S^*e_\alpha) = \frac14 \sum_{k=0}^{3} i^k \|Te_\alpha + i^k S^*e_\alpha\|^2 \qquad (\forall \alpha \in A).$$

Summing over all $\alpha \in A$, the right-hand side converges to $\frac14 \sum_{k=0}^{3} i^k \|T + i^k S^*\|_2^2$.
This proves that $ST$ has a trace.

(iv) The fact that $(\cdot,\cdot)_2$, which is well defined by (iii), is an inner product
on $L^2(H)$ is routine. The resulting inner product space is complete. Indeed,
every Cauchy sequence $\{T_n\}_{n=1}^\infty$ in $(L^2(H), \|\cdot\|_2)$ is also Cauchy in $B(H)$ as
the $\|\cdot\|_2$-norm dominates the operator norm. Since $B(H)$ is a Banach space,
$\{T_n\}_{n=1}^\infty$ converges in $B(H)$ to some $T$. An application of Fatou's lemma gives
that $T \in L^2(H)$ and that $\{T_n\}_{n=1}^\infty$ also converges to $T$ in $(L^2(H), \|\cdot\|_2)$.
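The inequalities of Part (i) can be checked numerically in the simplest case $H = \mathbb{R}^2$. The sketch below is only an illustration under that assumption: the matrices `T` and `S` are arbitrary, and `op_norm` uses the closed form for the largest singular value of a real $2 \times 2$ matrix rather than any method from the text.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def hs_norm(T):
    """Hilbert-Schmidt norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in T for x in row))

def op_norm(T):
    """Operator norm of a real 2x2 matrix (its largest singular value).

    The squared singular values are the roots of t^2 - f t + d^2 = 0,
    where f = ||T||_2^2 and d = |det T|.
    """
    f = sum(x * x for row in T for x in row)
    d = abs(T[0][0] * T[1][1] - T[0][1] * T[1][0])
    return math.sqrt((f + math.sqrt(max(f * f - 4 * d * d, 0.0))) / 2)

# Hypothetical example operators on R^2.
T = [[1.0, 2.0], [3.0, 4.0]]
S = [[0.0, 1.0], [2.0, 0.0]]

checks = {
    "||T|| <= ||T||_2": op_norm(T) <= hs_norm(T),
    "||ST||_2 <= ||S|| ||T||_2": hs_norm(matmul(S, T)) <= op_norm(S) * hs_norm(T),
    "||TS||_2 <= ||T||_2 ||S||": hs_norm(matmul(T, S)) <= hs_norm(T) * op_norm(S),
}
print(checks)
```

A single example of course proves nothing; it only shows what the three norms look like for concrete matrices.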


Definition 12.22. For $T \in B(H)$, put

$$\|T\|_1 := \sum_{\alpha \in A} (|T| e_\alpha, e_\alpha) = \big\||T|^{1/2}\big\|_2^2 \in [0, \infty],$$

where $\{e_\alpha\}_{\alpha \in A}$ is some orthonormal basis of $H$ (since $|T|$ is positive, this sum
is independent of the choice of basis as explained earlier). If $\|T\|_1 < \infty$, that
is, if $|T|$ has a trace, we say that $T$ is a trace-class operator, in which case
$\|T\|_1 = \operatorname{Tr}(|T|)$. The set of all these operators is denoted by $L^1(B(H))$ or $L^1(H)$.

Notice that $T \in L^1(H) \iff |T| \in L^1(H) \iff |T|^{1/2} \in L^2(H)$.


Example 12.23. Let $H$ be a Hilbert space with orthonormal basis $\{e_\alpha\}_{\alpha \in A}$
and let $\{\lambda_\alpha\}_{\alpha \in A} \in \ell^1(A)$. Then the diagonal operator $T \in B(H)$ given by
$Tx := \sum_{\alpha \in A} \lambda_\alpha (x, e_\alpha) e_\alpha$, $x \in H$, is of trace class, and
$\|T\|_1 = \sum_{\alpha \in A} |\lambda_\alpha| = \|\{\lambda_\alpha\}_{\alpha \in A}\|_{\ell^1(A)}$.

Proposition 12.24 (Basic properties of trace-class operators).

(i) We have $L^1(H) = \{T_1T_2;\ T_1, T_2 \in L^2(H)\}$, and for every $T_1, T_2 \in L^2(H)$
we have $\|T_1T_2\|_1 \le \|T_1\|_2 \|T_2\|_2$.

(ii) Every $T \in L^1(H)$ has a trace, and we have $|\operatorname{Tr} T| \le \operatorname{Tr}(|T|) = \|T\|_1$.

(iii) $(L^1(H), \|\cdot\|_1)$ is a Banach space.

(iv) For every $T, S \in B(H)$, we have: $\|T\|_1 = \|T^*\|_1$, $\|T\| \le \|T\|_1$, and
$\|TS\|_1 \le \|T\|_1 \|S\|,\ \|T\| \|S\|_1$. Thus, $L^1(H)$ is a generally non-closed,
selfadjoint ideal of $B(H)$.

(v) For every $T \in L^1(H)$ and $S \in B(H)$ we have $\operatorname{Tr}(TS) = \operatorname{Tr}(ST)$.

(vi) We have $F(H) \subset L^1(H) \subset L^2(H) \subset K(H)$, and for every $x, y \in H$ we
have $\operatorname{Tr}(\theta_{x,y}) = (x, y)$.

Proof. We will use Proposition 12.21 freely. 

(i)+(ii) Let $T_1, T_2 \in L^2(H)$ and let $T_1T_2 = V|T_1T_2|$ be the polar
decomposition of $T_1T_2$. Then $V^*T_1$ and $T_2$ are in $L^2(H)$, so that $(V^*T_1)T_2 =
|T_1T_2|$ has a trace, proving that $T_1T_2 \in L^1(H)$. Also, the Cauchy-Schwarz
inequality shows that

$$\|T_1T_2\|_1 = \operatorname{Tr}|T_1T_2| = \operatorname{Tr}((V^*T_1)T_2) = (T_2, T_1^*V)_2 \le \|T_2\|_2 \|T_1^*V\|_2
\le \|T_2\|_2 \|T_1^*\|_2 \|V\| \le \|T_2\|_2 \|T_1\|_2.$$

On the other hand, if $T \in L^1(H)$ has polar decomposition $T = V|T|$, then
$T = (V|T|^{1/2})|T|^{1/2}$ and $|T|^{1/2}$, thus also $V|T|^{1/2}$, belong to $L^2(H)$. This implies
that $T$ has a trace. Furthermore, as demonstrated, by the Cauchy-Schwarz
inequality,

$$|\operatorname{Tr} T| = \big|\operatorname{Tr}\big((V|T|^{1/2})(|T|^{1/2})\big)\big| = \big|(|T|^{1/2}, |T|^{1/2}V^*)_2\big| \le \big\||T|^{1/2}\big\|_2 \big\||T|^{1/2}V^*\big\|_2
\le \big\||T|^{1/2}\big\|_2^2 \|V^*\| \le \big\||T|^{1/2}\big\|_2^2 = \|T\|_1.$$
2 2 


(iii)+(iv) Let $T \in L^1(H)$. We have $\|T\|_1 = \big\||T|^{1/2}\big\|_2^2 \ge \big\||T|^{1/2}\big\|^2 = \big\||T|\big\| =
\|T\|$. Let again $T = V|T|$ be the polar decomposition of $T$. We have $T^* =
|T|V^* = |T|^{1/2}(|T|^{1/2}V^*)$, so

$$\|T^*\|_1 \le \big\||T|^{1/2}\big\|_2 \big\||T|^{1/2}V^*\big\|_2 \le \big\||T|^{1/2}\big\|_2^2 = \|T\|_1.$$

Hence, also $T^* \in L^1(H)$, and by symmetry, $\|T\|_1 = \|T^*\|_1$.

Very similarly, if $S \in B(H)$, then writing $TS = (V|T|^{1/2})(|T|^{1/2}S)$ proves
that $TS \in L^1(H)$ and $\|TS\|_1 \le \|T\|_1 \|S\|$, and by the previous paragraph we
also have $ST \in L^1(H)$ and $\|ST\|_1 \le \|T\|_1 \|S\|$.




Suppose that both $T, S \in L^1(H)$. Let $T + S = W|T + S|$ be the polar
decomposition of $T + S$. Then $W^*T, W^*S \in L^1(H)$, hence, they have traces, and
thus so does their sum, namely $|T + S|$. Also, $|\operatorname{Tr}(W^*T)| \le \|W^*T\|_1 \le \|T\|_1$,
and the same for $S$. Therefore $\|T + S\|_1 = \operatorname{Tr}(|T + S|) = \operatorname{Tr}(W^*T) + \operatorname{Tr}(W^*S) \le
\|T\|_1 + \|S\|_1$.

Finally, we show that $(L^1(H), \|\cdot\|_1)$ is complete. Every Cauchy sequence
$\{T_n\}_{n=1}^\infty$ in $(L^1(H), \|\cdot\|_1)$ is also Cauchy in $B(H)$, so it converges in $B(H)$ to
some $T$. Now, continuity of the absolute value map $T \mapsto |T|$ on $B(H)$ and Fatou's
lemma yield that $T \in L^1(H)$ and that $\{T_n\}_{n=1}^\infty$ converges to $T$ in $(L^1(H), \|\cdot\|_1)$.

(v) By the text in the beginning of this section, for every $T \in L^1(H)$ and
every unitary $U \in B(H)$ we have $\operatorname{Tr}(UTU^*) = \operatorname{Tr}(T)$. Equivalently, for every
$T \in L^1(H)$ and every unitary $U \in B(H)$ we have $\operatorname{Tr}(UT) = \operatorname{Tr}(TU)$. By linearity
and the fact that every element of $B(H)$ is a linear combination of (at most
four) unitary operators in $B(H)$, we conclude that $\operatorname{Tr}(ST) = \operatorname{Tr}(TS)$ for all
$T \in L^1(H)$ and $S \in B(H)$.

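(vi) If $x, y \in H$, then $\|\theta_{x,y}\|_1 = \|x\| \|y\|$, so $\theta_{x,y} \in L^1(H)$. This proves
that $F(H) \subset L^1(H)$. Let $T \in B(H)$ have polar decomposition $T = V|T|$.
Then $\|T\|_2 = \big\|V|T|^{1/2}|T|^{1/2}\big\|_2 \le \big\|V|T|^{1/2}\big\| \big\||T|^{1/2}\big\|_2 = \big\|V|T|^{1/2}\big\| \|T\|_1^{1/2}$.
Therefore $L^1(H) \subset L^2(H)$. The equality $\operatorname{Tr}(\theta_{x,y}) = (x, y)$, $x, y \in H$, is easy.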


Remark that the converse of Part (ii) is also true: every operator that has a 
trace is a trace-class operator; see Exercise 17. 
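For matrices, $\|T\|_1 = \operatorname{Tr}(|T|)$ is the sum of the singular values of $T$, so several parts of Proposition 12.24 can be illustrated numerically. The sketch below does this for real $2 \times 2$ matrices only; the matrices `T` and `S` and all helper names are assumptions made for the example, and the singular values are obtained from the closed form available in dimension two.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(T):
    return T[0][0] + T[1][1]

def singular_values(T):
    """Singular values (s1 >= s2) of a real 2x2 matrix.

    Uses s1^2 + s2^2 = sum of squared entries and s1 * s2 = |det T|.
    """
    f = sum(x * x for row in T for x in row)
    d = abs(T[0][0] * T[1][1] - T[0][1] * T[1][0])
    root = math.sqrt(max(f * f - 4 * d * d, 0.0))
    return math.sqrt((f + root) / 2), math.sqrt(max((f - root) / 2, 0.0))

def trace_norm(T):
    """||T||_1 = Tr|T|, the sum of the singular values."""
    return sum(singular_values(T))

# Hypothetical example operators on R^2.
T = [[1.0, 2.0], [3.0, 4.0]]
S = [[0.0, 1.0], [2.0, 0.0]]

op = singular_values(T)[0]                               # ||T||
hs = math.sqrt(sum(s * s for s in singular_values(T)))   # ||T||_2
tr1 = trace_norm(T)                                      # ||T||_1

print(op <= hs <= tr1)                                   # norm chain behind (iv) and (vi)
print(abs(trace(T)) <= tr1)                              # Part (ii)
print(trace(matmul(S, T)), trace(matmul(T, S)))          # Part (v): equal traces
print(trace_norm(matmul(T, S)) <= tr1 * singular_values(S)[0])  # Part (iv)
```

For this particular `T` one has $(s_1 + s_2)^2 = 30 + 2 \cdot 2 = 34$, so $\|T\|_1 = \sqrt{34}$, while $|\operatorname{Tr} T| = 5$.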


Theorem 12.25. For a Hilbert space $H$, there are canonical isometric
isomorphisms $K(H)^* \cong L^1(H)$ and $L^1(H)^* \cong B(H)$.


Proof. Part I: $L^1(H)^* \cong B(H)$. Define a linear map $\Psi : B(H) \to L^1(H)^*$ by
$(\Psi(S))(T) := \operatorname{Tr}(ST)$ ($S \in B(H)$, $T \in L^1(H)$). It is well defined and contractive
($\|\Psi(S)\| \le \|S\|$) because by Proposition 12.24,

$$ST \in L^1(H) \text{ and } |\operatorname{Tr}(ST)| \le \|ST\|_1 \le \|S\| \|T\|_1 \qquad (\forall S \in B(H),\ T \in L^1(H)). \tag{3}$$

Note that

$$(\Psi(S))(\theta_{x,y}) = \operatorname{Tr}(S\theta_{x,y}) = \operatorname{Tr}(\theta_{Sx,y}) = (Sx, y) \qquad (\forall S \in B(H),\ x, y \in H). \tag{4}$$

This implies that $\Psi$ is injective. To show that $\Psi$ is a surjective isometry, let
$\varphi \in L^1(H)^*$. Define a map $\langle\cdot,\cdot\rangle : H \times H \to \mathbb{C}$ by $\langle x, y\rangle := \varphi(\theta_{x,y})$ ($x, y \in H$).
It is linear in the left variable and conjugate linear in the right variable. Since
$|\langle x, y\rangle| \le \|\varphi\| \|\theta_{x,y}\|_1 = \|\varphi\| \|x\| \|y\|$, $\langle\cdot,\cdot\rangle$ is bounded by $\|\varphi\|$, so we get from
Lemma 11.30 an operator $S \in B(H)$ with $(Sx, y) = \langle x, y\rangle = \varphi(\theta_{x,y})$ for every
$x, y \in H$ and $\|S\| \le \|\varphi\|$. Particularly, by (4), $\Psi(S)$ agrees with $\varphi$ on $F(H)$,
which is $\|\cdot\|_1$-dense in $L^1(H)$ by Theorem 12.26; therefore $\Psi(S) = \varphi$. Hence,
$\|S\| \le \|\varphi\| = \|\Psi(S)\| \le \|S\|$, so $\|\Psi(S)\| = \|S\|$. This completes the proof.

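Part II: $K(H)^* \cong L^1(H)$. Consider the linear map $\Psi : L^1(H) \to K(H)^*$ given
by $(\Psi(S))(K) := \operatorname{Tr}(SK)$ ($S \in L^1(H)$, $K \in K(H)$). By (3), $\Psi$ is a well-defined
contraction, and as in Part I, $\Psi$ is injective. To show that $\Psi$ is a surjective
isometry, let $\varphi \in K(H)^*$. Define a map $\langle\cdot,\cdot\rangle : H \times H \to \mathbb{C}$ by $\langle x, y\rangle := \varphi(\theta_{x,y})$
($x, y \in H$). Since $|\langle x, y\rangle| \le \|\varphi\| \|\theta_{x,y}\| = \|\varphi\| \|x\| \|y\|$, Lemma 11.30 yields an
operator $S \in B(H)$ with $(Sx, y) = \langle x, y\rangle = \varphi(\theta_{x,y})$ for every $x, y \in H$. To show
that $S \in L^1(H)$, fix an orthonormal basis $\{e_\alpha\}_{\alpha \in A}$ of $H$, and write the polar
decomposition $S = V|S|$. Then $(|S| e_\alpha, e_\alpha) = (V^*Se_\alpha, e_\alpha) = (Se_\alpha, Ve_\alpha)$ for all
$\alpha \in A$. For every finite $F \subset A$ we have

$$\sum_{\alpha \in F} (|S| e_\alpha, e_\alpha) = \sum_{\alpha \in F} (Se_\alpha, Ve_\alpha) = \sum_{\alpha \in F} \varphi(\theta_{e_\alpha, Ve_\alpha}) = \varphi\Big(\sum_{\alpha \in F} \theta_{e_\alpha, Ve_\alpha}\Big).$$

But we have $\big\|\sum_{\alpha \in F} \theta_{e_\alpha, Ve_\alpha}\big\| \le 1$ (why?), so $\sum_{\alpha \in F} (|S| e_\alpha, e_\alpha) \le \|\varphi\|$. This
being true for all finite $F \subset A$, we conclude that $\|S\|_1 = \sum_{\alpha \in A} (|S| e_\alpha, e_\alpha) \le
\|\varphi\|$, so that $S \in L^1(H)$. By construction, $\Psi(S)$ agrees with $\varphi$ on $F(H)$, which
is $\|\cdot\|$-dense in $K(H)$ by Exercise 8 in Chapter 6; therefore $\Psi(S) = \varphi$. Since $\Psi$ is
contractive we obtain $\|\Psi(S)\| = \|\varphi\| = \|S\|_1$.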
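The identity (4), on which both parts of the proof rest, is easy to test on $\mathbb{C}^2$. The following sketch is an illustration only: the operator `S` and the vectors `x`, `y` are arbitrary, and `theta(x, y)` builds the matrix of $w \mapsto (w, y)x$ with respect to the standard basis, following the convention that the inner product is linear in its first argument.

```python
def inner(u, v):
    """Inner product on C^n, linear in u and conjugate linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def theta(x, y):
    """Matrix of the rank-one operator w -> (w, y) x."""
    return [[xi * yj.conjugate() for yj in y] for xi in x]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(A, v):
    return [sum(a * c for a, c in zip(row, v)) for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Hypothetical example data on C^2.
S = [[1 + 2j, 0.5j], [3 + 0j, -1 + 1j]]
x = [1 + 1j, 2 - 1j]
y = [0.5j, 1 + 0j]

pairing = trace(matmul(S, theta(x, y)))  # Tr(S theta_{x,y})
expected = inner(apply(S, x), y)         # (Sx, y)

print(pairing, expected)                 # equal up to rounding
print(trace(theta(x, y)), inner(x, y))   # Tr(theta_{x,y}) = (x, y)
```

The second printed pair is the special case $S = I$, which is the formula in Part (vi) of Proposition 12.24.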


We saw that the predual $B(H)_*$ of $B(H)$, which equals the closure in $B(H)^*$
of the subspace of w.o.-continuous linear functionals (Proposition 12.13) as well
as the subspace of normal linear functionals in $B(H)^*$ (Definition 12.16 and
Theorem 12.17), is isometrically isomorphic to $L^1(H)$ (Theorem 12.25). This
isomorphism is given by $\omega_{x,y} \leftrightarrow \theta_{x,y}$, as $\operatorname{Tr}(S\theta_{x,y}) = (Sx, y) = \omega_{x,y}(S)$ for all
$S \in B(H)$. We now find another description of $B(H)_*$.


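Theorem 12.26. Every $T \in L^1(H)$ can be written as

$$T = \|\cdot\|_1\text{-}\sum_{n=1}^{\infty} \theta_{x_n, y_n}$$

(“$\|\cdot\|_1$-” means that the series converges in the Banach space $(L^1(H), \|\cdot\|_1)$),
where $\{x_n\}_{n=1}^\infty$ and $\{y_n\}_{n=1}^\infty$ are sequences in $H$ with $\sum_{n=1}^\infty \|x_n\|^2 < \infty$,
$\sum_{n=1}^\infty \|y_n\|^2 < \infty$, and $\|T\|_1 = \sum_{n=1}^\infty \|x_n\| \|y_n\|$. If $T \ge 0$, it can be arranged so
that $x_n = y_n$ for every $n \in \mathbb{N}$.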


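Proof. We have $T \in K(H)$ by Proposition 12.24. Suppose first that $T$ is
positive. By the spectral theorem for compact normal operators (Exercise 4
in Chapter 9), there exist an orthonormal sequence $\{e_n\}_{n=1}^\infty$ in $H$ (or a finite
orthonormal sequence if $H$ is finite dimensional) and a non-negative sequence
$\{\lambda_n\}_{n=1}^\infty \in c_0$ such that $T = \sum_{n=1}^\infty \lambda_n \theta_{e_n, e_n}$. This convergence holds in
$(L^1(H), \|\cdot\|_1)$ because $\sum_{n=1}^\infty \lambda_n = \|T\|_1 < \infty$ and
$\big\|T - \sum_{n=1}^N \lambda_n \theta_{e_n, e_n}\big\|_1 = \sum_{n=N+1}^\infty \lambda_n$ for all $N \in \mathbb{N}$. Thus, letting
$x_n := \lambda_n^{1/2} e_n$ ($n \in \mathbb{N}$), we get a sequence with the desired properties.

For the general case, let $T = V|T|$ be the polar decomposition of $T$. Note
that $V\theta_{x,y} = \theta_{Vx,y}$ for all $x, y \in H$. Apply the previous paragraph to $|T|$ to
express it as $\|\cdot\|_1\text{-}\sum_{n=1}^\infty \theta_{y_n, y_n}$ with $\{y_n\}_{n=1}^\infty$ in $H$ satisfying $\|T\|_1 = \big\||T|\big\|_1 =
\sum_{n=1}^\infty \|y_n\|^2 < \infty$. Then with $x_n := Vy_n$ ($n \in \mathbb{N}$) we have $\sum_{n=1}^\infty \|x_n\|^2 < \infty$
and $T = \|\cdot\|_1\text{-}\sum_{n=1}^\infty \theta_{x_n, y_n}$. Also, since $\|V\| \le 1$, we have

$$\sum_{n=1}^\infty \|x_n\| \|y_n\| = \sum_{n=1}^\infty \|Vy_n\| \|y_n\| \le \sum_{n=1}^\infty \|y_n\|^2 = \|T\|_1 \le \sum_{n=1}^\infty \|\theta_{x_n, y_n}\|_1 = \sum_{n=1}^\infty \|x_n\| \|y_n\|,$$

proving that $\|T\|_1 = \sum_{n=1}^\infty \|x_n\| \|y_n\|$.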
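The positive case of the proof can be made concrete for a positive symmetric $2 \times 2$ matrix, where the spectral decomposition is available in closed form. The sketch below is a toy instance under that assumption: the matrix `T` and the helper `eig_sym_2x2` (which requires a nonzero off-diagonal entry) are choices made for the example. It builds $x_n := \lambda_n^{1/2} e_n$ and checks that $\sum_n \theta_{x_n, x_n} = T$ and $\sum_n \|x_n\|^2 = \operatorname{Tr} T = \|T\|_1$.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigenpairs of the symmetric matrix [[a, b], [b, c]], assuming b != 0."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    pairs = []
    for lam in (mean + radius, mean - radius):
        v = (b, lam - a)  # solves (T - lam) v = 0 when b != 0
        n = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / n, v[1] / n)))
    return pairs

def theta(x, y):
    """Matrix of w -> (w, y) x for real vectors x, y."""
    return [[xi * yj for yj in y] for xi in x]

# Hypothetical positive operator on R^2 with eigenvalues 3 and 1.
T = [[2.0, 1.0], [1.0, 2.0]]

# x_n := sqrt(lambda_n) e_n, as in the positive case of the proof.
xs = [tuple(math.sqrt(lam) * c for c in e)
      for lam, e in eig_sym_2x2(T[0][0], T[0][1], T[1][1])]

rebuilt = [[sum(theta(x, x)[i][j] for x in xs) for j in range(2)]
           for i in range(2)]
norm_sum = sum(c * c for x in xs for c in x)  # sum of ||x_n||^2

print(rebuilt)   # reproduces T up to rounding
print(norm_sum)  # equals Tr T = 4 up to rounding
```

In infinite dimensions the same construction yields a series rather than a finite sum, and the content of the theorem is that this series converges in $\|\cdot\|_1$.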


Corollary 12.27. The predual $B(H)_*$ consists precisely of all $\omega \in B(H)^*$ of
the form $\omega = \sum_{n=1}^\infty \omega_{x_n, y_n}$ (the convergence is in the norm topology of $B(H)^*$)
where $\{x_n\}_{n=1}^\infty$ and $\{y_n\}_{n=1}^\infty$ are sequences in $H$ with $\sum_{n=1}^\infty \|x_n\|^2 < \infty$ and
$\sum_{n=1}^\infty \|y_n\|^2 < \infty$. It can be arranged that $\|\omega\| = \sum_{n=1}^\infty \|x_n\| \|y_n\|$, and when
$\omega \in B(H)_*$ is positive, that also $x_n = y_n$ for every $n$.


Proof. If $\{x_n\}_{n=1}^\infty$ and $\{y_n\}_{n=1}^\infty$ are sequences in $H$ with $\sum_{n=1}^\infty \|x_n\|^2 < \infty$ and
$\sum_{n=1}^\infty \|y_n\|^2 < \infty$, then the series $\sum_{n=1}^\infty \omega_{x_n, y_n}$ converges absolutely in $B(H)^*$
by the Cauchy-Schwarz inequality because $\|\omega_{x,y}\| = \|x\| \|y\|$ for each $x, y \in H$.
Its sum belongs to $B(H)_*$ by Proposition 12.13.

The converse, as well as the case of positive elements of $B(H)_*$, is just a
restatement of Theorem 12.26.


Definition 12.28. The ultraweak (or $\sigma$-weak) topology on $B(H)$ is the
topology induced by all linear functionals in $B(H)^*$ of the form discussed
in Corollary 12.27. That is, a net $\{A_\alpha\}_\alpha$ in $B(H)$ converges ultraweakly to
$A \in B(H)$ iff for all sequences $\{x_n\}_{n=1}^\infty$ and $\{y_n\}_{n=1}^\infty$ in $H$ with $\sum_{n=1}^\infty \|x_n\|^2 < \infty$
and $\sum_{n=1}^\infty \|y_n\|^2 < \infty$, we have $\omega(A_\alpha) \to \omega(A)$ for $\omega := \sum_{n=1}^\infty \omega_{x_n, y_n}$.


Corollary 12.27 means that the ultraweak and the weak* topologies on B(H) 
coincide. 

Finally, we summarize the characterizations of the predual of a von Neumann
algebra $R$ on $H$ that we have found so far and add two more. Recall that $R_*$
is identified with the restriction of elements of $B(H)_*$ to $R$. By the previous
paragraph, the weak* topology of $R$ coincides with the ultraweak topology
induced from $B(H)$. Let $\omega \in R^*$. Then $\omega \in R_*$, that is, $\omega$ is weak*-continuous
(equivalently: normal) iff it is ultraweakly continuous. In addition, by the
Krein-Šmulian theorem (5.52), $\omega$ is weak*-continuous iff its restriction to the closed
unit ball of $R$ is weak*-continuous, and by Proposition 12.15, this holds iff this
restriction is w.o.-continuous. We proved:


Theorem 12.29. Let $R$ be a von Neumann algebra on a Hilbert space $H$. For
every $\omega \in R^*$, the following conditions are equivalent:

(i) $\omega$ is weak*-continuous, that is, $\omega \in R_*$;

(ii) $\omega$ is ultraweakly continuous: there exist sequences $\{x_n\}_{n=1}^\infty$ and $\{y_n\}_{n=1}^\infty$
in $H$ with $\sum_{n=1}^\infty \|x_n\|^2 < \infty$ and $\sum_{n=1}^\infty \|y_n\|^2 < \infty$ such that
$\omega = \sum_{n=1}^\infty \omega_{x_n, y_n}|_R$;

(iii) $\omega$ is normal;

(iv) the restriction of $\omega$ to the closed unit ball of $R$ is weak*-continuous (i.e.,
ultraweakly continuous); and

(v) the restriction of $\omega$ to the closed unit ball of $R$ is w.o.-continuous.

When $\omega$ satisfies these equivalent conditions and is positive, it can be arranged
so that $x_n = y_n$ in (ii).


12.7 Commutative von Neumann algebras 


This section presents some of the main results on the structure of commutative
von Neumann algebras. In comparison to the commutative Gelfand-Naimark
theorem, which says that a commutative C*-algebra “is” the $C_0$ algebra of a
topological space, we will see that a commutative von Neumann algebra “is”,
roughly, the $L^\infty$ algebra of a measure space.

Recall from Section 12.1 the notion of spatial equivalence of von Neumann
algebras and from Section 12.2 the notion of a cyclic vector of a von Neumann
algebra. We use the notation $M_f$ introduced in Example 12.7.


Theorem 12.30. Let $R$ be a commutative von Neumann algebra on a Hilbert
space $H$ admitting a cyclic vector. Then there exists a finite positive measure
space $(X, \mathcal{A}, \mu)$ such that $R$ is spatially isomorphic to the von Neumann algebra
$\{M_f;\ f \in L^\infty(X, \mathcal{A}, \mu)\}$ acting on $L^2(X, \mathcal{A}, \mu)$.


Proof. Since $R$ is a unital commutative C*-algebra, the commutative Gelfand-Naimark
theorem (7.16) says that there exist a compact Hausdorff space $X$ and
a *-isomorphism $\Gamma$ from $R$ onto $C(X)$.

Let $\zeta \in H$ be a unit cyclic vector for $R$ and consider the state $\omega := \omega_\zeta|_R$
of $R$. Then $\omega \circ \Gamma^{-1}$ is a state of $C(X)$, so by the Riesz-Markov theorem (3.18)
there are a $\sigma$-algebra $\mathcal{A}$ on $X$ and a positive measure $\mu$ on $(X, \mathcal{A})$ such that

$$(\omega \circ \Gamma^{-1})(f) = \int_X f \, d\mu \qquad (\forall f \in C(X)). \tag{1}$$

In particular, taking $f := 1$ leads to $\mu(X) = 1$.

Define an operator $U_0 : R\zeta \to C(X)$ by $T\zeta \mapsto \Gamma(T)$, $T \in R$. Then, viewing
$C(X)$ as a normed subspace of $L^2(X, \mathcal{A}, \mu)$, $U_0$ is (well-defined and) isometric,
because for each $T$, writing $f := \Gamma(T)$ we have $|f|^2 = \Gamma(T^*T)$ and

$$\|f\|_{L^2}^2 = \int_X |f|^2 \, d\mu = (\omega \circ \Gamma^{-1})(|f|^2) = \omega(T^*T) = \|T\zeta\|^2$$

by (1). Also $\overline{R\zeta} = H$ since $\zeta$ is cyclic for $R$, and $C(X)$ is dense in $L^2(X, \mathcal{A}, \mu)$ by
Corollary 3.21 of Lusin's theorem. All in all, $U_0$ extends to a unitary equivalence
$U : H \to L^2(X, \mathcal{A}, \mu)$.

Moreover for every $T \in R$ we have $UTU^* = M_{\Gamma(T)}$. Indeed, for each $S \in R$,

$$(UT)(S\zeta) = U(TS\zeta) = \Gamma(TS) = \Gamma(T) \cdot \Gamma(S) = \Gamma(T) \cdot (US\zeta) = M_{\Gamma(T)}(US\zeta).$$

In particular we have $URU^* = \{M_f;\ f \in C(X)\}$. But $URU^*$ is evidently a
von Neumann algebra on $L^2(X, \mathcal{A}, \mu)$, and $\{M_f;\ f \in C(X)\}$ is s.o.-dense in the
von Neumann algebra $\{M_f;\ f \in L^\infty(X, \mathcal{A}, \mu)\}$ by Exercise 18. Hence, the two
algebras coincide.


Remark 12.31. The map that takes $f \in L^\infty(X, \mathcal{A}, \mu)$ to $M_f$ is injective. Thus,
the end of the proof shows the surprising equality $C(X) = L^\infty(X, \mathcal{A}, \mu)$ for this
specific $(X, \mathcal{A}, \mu)$; that is, for every $f \in L^\infty(X, \mathcal{A}, \mu)$ there exists $f' \in C(X)$ such
that $f = f'$ a.e.! The reason for this seemingly unnatural equality has to do with
the topology of $X$ and the nature of the measure $\mu$. We will not elaborate.


We conclude this section with two more theorems about commutative von
Neumann algebras, which are given without proofs: the first deals with algebras
acting on separable Hilbert spaces, and the second discusses the general case.


Theorem 12.32. Let $R$ be a commutative von Neumann algebra on a separable
Hilbert space. Then there exists a finite positive measure space $(X, \mathcal{A}, \mu)$ such
that $R$ is *-isomorphic to $L^\infty(X, \mathcal{A}, \mu)$.


In comparison to Theorem 12.30, in which the existence of a cyclic vector
guarantees a spatial isomorphism between $R$ and $L^\infty(X, \mathcal{A}, \mu)$ when the latter
acts canonically on $L^2(X, \mathcal{A}, \mu)$, in the situation of Theorem 12.32 we can only
deduce that $R$ and $L^\infty(X, \mathcal{A}, \mu)$ are isomorphic as C*-algebras.


Theorem 12.33. Let $R$ be a commutative von Neumann algebra. Then $R$ is
spatially equivalent to the direct sum of a family $\{R_\alpha\}_{\alpha \in A}$ of von Neumann
algebras such that for each $\alpha \in A$ there exists a finite positive measure space
$(X_\alpha, \mathcal{A}_\alpha, \mu_\alpha)$ such that $R_\alpha$ is *-isomorphic to $L^\infty(X_\alpha, \mathcal{A}_\alpha, \mu_\alpha)$.


12.8 The enveloping von Neumann algebra of a 
C*-algebra 
This section demonstrates how to obtain a canonical von Neumann algebra from
an arbitrary C*-algebra $\mathcal{A}$.

We begin with a non-degenerate representation $\pi$ of $\mathcal{A}$ on a Hilbert space $H$.
Then $\overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$ is a von Neumann algebra on $H$. For $X \in \mathcal{A}^{**}$, consider the
function $\langle\cdot,\cdot\rangle : H \times H \to \mathbb{C}$ given by

$$\langle x, y\rangle := X(\omega_{x,y} \circ \pi) \qquad (x, y \in H)$$

(which is well defined as $\omega_{x,y} \circ \pi$ belongs to $\mathcal{A}^*$). It is linear in the left variable
and conjugate linear in the right variable, and for every $x, y \in H$ we have

$$|\langle x, y\rangle| \le \|X\| \|\omega_{x,y} \circ \pi\| \le \|X\| \|x\| \|y\|,$$

so $\langle\cdot,\cdot\rangle$ is bounded by $\|X\|$. By Lemma 11.30, there exists a (unique) operator
$\tilde\pi(X) \in B(H)$ such that for all $x, y \in H$,

$$(\tilde\pi(X)x, y) = \langle x, y\rangle,$$

that is,

$$(\tilde\pi(X)x, y) = X(\omega_{x,y} \circ \pi). \tag{1}$$


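We claim that $\tilde\pi(X) \in \overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$, and if $X \in \mathcal{A}$ (when $\mathcal{A}$ is viewed as
embedded in $\mathcal{A}^{**}$) then $\tilde\pi(X) = \pi(X)$. The latter assertion follows from (1)
since $(\pi(X)x, y) = (\omega_{x,y} \circ \pi)(X)$ for all $x, y \in H$ when $X \in \mathcal{A}$. To prove the
former assertion, let $T' \in (\pi(\mathcal{A}))'$. Then for all $x, y \in H$ the functionals $\omega_{T'x,y}$
and $\omega_{x,(T')^*y}$ coincide on $\pi(\mathcal{A})$, namely, $\omega_{T'x,y} \circ \pi = \omega_{x,(T')^*y} \circ \pi$, and thus,

$$(\tilde\pi(X)T'x, y) = X(\omega_{T'x,y} \circ \pi) = X(\omega_{x,(T')^*y} \circ \pi)
= (\tilde\pi(X)x, (T')^*y) = (T'\tilde\pi(X)x, y),$$

proving that $\tilde\pi(X)$ commutes with $T'$, hence $\tilde\pi(X) \in (\pi(\mathcal{A}))'' = \overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$ by
von Neumann's double commutant theorem (12.4).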
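Since $X \in \mathcal{A}^{**}$ was arbitrary, we have constructed an extension

$$\tilde\pi : \mathcal{A}^{**} \to \overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$$

of $\pi$, which is clearly linear and of norm at most 1. Also, by (1) and the fact that
$\{\omega_{x,y};\ x, y \in H\}$ spans a dense subspace of $B(H)_*$ (Proposition 12.13), we have

$$\omega(\tilde\pi(X)) = X(\omega \circ \pi) \qquad (\forall X \in \mathcal{A}^{**},\ \omega \in B(H)_*). \tag{2}$$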


It is therefore clear that $\tilde\pi$ is weak*-weak*-continuous, or equivalently,
weak*-ultraweakly continuous. Notice that by (2), $\tilde\pi$ is nothing but the Banach adjoint
of the bounded linear map $\big(\overline{\pi(\mathcal{A})}^{\text{w.o.t.}}\big)_* \to \mathcal{A}^*$ given by $\omega \mapsto \omega \circ \pi$.

(Remark that we could equally define $\tilde\pi$ by (2) or as in the previous sentence.
This would be more “to the point”. However, starting with (1) and essentially
repeating part of the proof of Proposition 12.13 should be more illuminating to
the reader.)


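The closed unit ball of $\overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$ is w.o.-compact and it contains the image
under $\tilde\pi$ of the closed unit ball of $\mathcal{A}^{**}$. Furthermore, $\pi$ maps the open unit ball
of $\mathcal{A}$ onto that of $\pi(\mathcal{A})$ (see Exercise 2 in Chapter 11), and the latter is
w.o.-dense in the closed unit ball of $\overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$ by Kaplansky's density theorem (12.8).
Finally, the closed unit ball of $\mathcal{A}^{**}$ is weak*-compact by Alaoglu's theorem (5.24),
and therefore it is mapped by the weak*-ultraweakly continuous map $\tilde\pi$ to an
ultraweakly compact subset of $\overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$. All in all, $\tilde\pi$ maps the closed unit ball of
$\mathcal{A}^{**}$ onto that of $\overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$. In particular, this implies that
$\tilde\pi : \mathcal{A}^{**} \to \overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$ is surjective.

Recall from the non-commutative Gelfand-Naimark theorem (11.26) that
the universal representation of $\mathcal{A}$, to be denoted here by $\pi_u$, is the direct sum
representation $\sum_{\omega \in S(\mathcal{A})} \oplus \pi_\omega$, where $S(\mathcal{A})$ is the set of all states of $\mathcal{A}$ and for
$\omega \in S(\mathcal{A})$, $\pi_\omega$ is the cyclic representation of $\mathcal{A}$ on a Hilbert space $H_\omega$ associated
with $\omega$ by Theorem 11.23. The map $\pi_u$ is a non-degenerate representation of $\mathcal{A}$
on the Hilbert space $H_u := \sum_{\omega \in S(\mathcal{A})} \oplus H_\omega$.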




Apply the shown construction to $\pi_u$ to yield a surjective, weak*-ultraweakly
continuous, contractive linear map $\tilde\pi_u : \mathcal{A}^{**} \to \overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}}$. We prove in the next
paragraph that $\tilde\pi_u$ is injective. This has two implications, both relying on the
fact that the restriction of $\tilde\pi_u$ to the closed unit ball of $\mathcal{A}^{**}$ has the closed unit
ball of $\overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}}$ as its range. First, $\tilde\pi_u$ is isometric. Second, this restriction of
$\tilde\pi_u$ is injective and weak*-weak*-continuous, its domain is compact and its range
is Hausdorff in the respective topologies. Consequently, this restriction of $\tilde\pi_u$ is a
weak*-weak*-homeomorphism. By the Krein-Šmulian theorem (5.52), $\tilde\pi_u$ itself
is a weak*-weak*-homeomorphism.

It remains to prove that $\tilde\pi_u$ is injective. As explained, $\tilde\pi_u$ is the Banach adjoint
of the bounded linear map $(\tilde\pi_u)_* : \big(\overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}}\big)_* \to \mathcal{A}^*$ given by $\omega \mapsto \omega \circ \pi_u$. The
map $(\tilde\pi_u)_*$ is surjective, because every $\rho \in \mathcal{A}^*$ is a linear combination of states
of $\mathcal{A}$ by the Jordan decomposition (Theorem 11.32), and by the universality of
$\pi_u$, every state $\rho$ of $\mathcal{A}$ can be written as $\omega_x \circ \pi_u$ for a suitable $x \in H_u$ (and the
vector functional $\omega_x$ restricts to an element of $\big(\overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}}\big)_*$!). It follows from
Part (e) of Exercise 9 in Chapter 6 that $\tilde\pi_u$ is injective.


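Theorem 12.34. Let $\mathcal{A}$ be a C*-algebra. Viewing $\mathcal{A}$ as a Banach subspace of
$\mathcal{A}^{**}$, the universal representation $\pi_u$ of $\mathcal{A}$ extends uniquely to a weak*-ultraweakly
continuous function $\tilde\pi_u : \mathcal{A}^{**} \to \overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}}$. Furthermore:

(i) $\tilde\pi_u$ is linear, isometric, surjective, and a weak*-ultraweak-homeomorphism.

(ii) For every non-degenerate representation $\pi$ of $\mathcal{A}$ on a Hilbert space $H$ there
exists a unique ultraweakly continuous function from $\overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}}$ to $B(H)$
mapping $\pi_u(A)$ to $\pi(A)$ for each $A \in \mathcal{A}$. This map is a representation of
$\overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}}$ and its range is $\overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$.
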
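Proof. Uniqueness is clear by the weak*-density of $\mathcal{A}$ in $\mathcal{A}^{**}$. Existence and (i)
were proved earlier.

(ii) Uniqueness is again clear by density. The linear map
$\Pi := \tilde\pi \circ (\tilde\pi_u)^{-1} : \overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}} \to \overline{\pi(\mathcal{A})}^{\text{w.o.t.}}$, which is surjective and ultraweakly continuous, maps
$\pi_u(A)$ to $\pi(A)$ for each $A \in \mathcal{A}$. It remains to prove that $\Pi$ is a *-homomorphism.
Let $T, S \in \overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}}$ and let $\{A_\alpha\}_{\alpha \in I}$, $\{B_\beta\}_{\beta \in J}$ be nets in $\mathcal{A}$ such that
the nets $\{\pi_u(A_\alpha)\}_{\alpha \in I}$, $\{\pi_u(B_\beta)\}_{\beta \in J}$ converge ultraweakly to $T$, $S$, respectively.
For every fixed $\beta \in J$, the net $\{\pi_u(A_\alpha B_\beta)\}_{\alpha \in I} = \{\pi_u(A_\alpha)\pi_u(B_\beta)\}_{\alpha \in I}$ converges
ultraweakly to $T\pi_u(B_\beta)$, and thus the ultraweak continuity of $\Pi$ implies that

$$\Pi(T\pi_u(B_\beta)) = \lim_\alpha \Pi(\pi_u(A_\alpha B_\beta)) = \lim_\alpha \pi(A_\alpha B_\beta) = \lim_\alpha \pi(A_\alpha)\pi(B_\beta)
= \lim_\alpha \Pi(\pi_u(A_\alpha)) \, \Pi(\pi_u(B_\beta)) = \Pi(T) \, \Pi(\pi_u(B_\beta)). \tag{3}$$

Since also the net $\{\Pi(\pi_u(B_\beta))\}_{\beta \in J}$ converges ultraweakly to $\Pi(S)$, taking now
the limit with respect to $\beta$ in (3) gives $\Pi(TS) = \Pi(T)\Pi(S)$, proving that $\Pi$ is
multiplicative. A similar (and easier) argument shows that $\Pi$ is *-preserving.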




The von Neumann algebra $\overline{\pi_u(\mathcal{A})}^{\text{w.o.t.}}$ is called the enveloping von Neumann
algebra of the C*-algebra $\mathcal{A}$. Part (ii) of the theorem is a universality property
of this von Neumann algebra in terms of the representations of $\mathcal{A}$: every
non-degenerate representation of $\mathcal{A}$ “extends” to an ultraweakly continuous
representation of its enveloping von Neumann algebra.


Exercises 


Let H be a Hilbert space and R be a von Neumann algebra on H. 


1. Let $\mathcal{I}$ be a w.o.-closed ideal in $R$. Prove that there exists a projection $P$
in the center $R \cap R'$ of $R$ such that $\mathcal{I} = RP := \{AP;\ A \in R\}$.


2. Let $U \in R$ be unitary. Prove that there is a selfadjoint $A \in R$ such that
$U = e^{iA}$. (Compare Exercise 28 in Chapter 7.)


3. Prove that a vector $x \in H$ is separating for $R$ iff the positive functional
$\omega_x|_R$ is faithful (see Exercise 19 in Chapter 11).


4. Prove Lemma 12.10.

5. Prove that if $T \in B(H)$ is normal, then its support is $\overline{TH}$.


6. (a) Let $A \in B(H)_+$. Prove that $\ker A = \ker A^{1/2}$. In fact, with a little
more thought, prove that $\ker A = \ker A^\alpha$ for every $\alpha > 0$.


(b) Let $T \in B(H)$. Prove that the kernels of $T$, $T^*T$, $|T|$ are equal, and
thus so are their supports.


7. (The polar decomposition of unbounded operators.) In this exercise we 
extend Theorem 12.11. We use freely the terminology and results of 
Chapter 10 and Exercises 15 and 17 therein. Let T be a (generally 
unbounded) closed densely defined operator on H. 


(a) The support of $T$ is defined to be the closed subspace $(\ker T)^\perp$ of $H$.
Prove that it equals $\overline{T^*H}$.


(b) Imitating Exercise 6, show that the kernels of $T$, $T^*T$, $|T|$ are equal,
and thus so are their supports, and that the latter are equal to $\overline{|T|H}$.


(c) Existence: using Exercise 17 in Chapter 10, verify that the proof of
Part (i) of Theorem 12.11 carries over verbatim, proving the existence of
a partial isometry $V \in B(H)$, whose initial space is the support of $T$
and whose final space is $\overline{TH}$, such that $T = V|T|$.


(d) Uniqueness: using Exercise 15 in Chapter 10, verify that the proof
of Part (ii) of Theorem 12.11 carries over verbatim, proving that if $A$ is a
positive selfadjoint operator on $H$ and $W \in B(H)$ is a partial isometry
with initial space $\overline{AH}$ such that $T = WA$, then $A = |T|$ and $W = V$.


(e) Verify that the proof of Part (iii) of Theorem 12.11 also carries over.


8. Under the conditions of Kaplansky's density theorem, show that for every
$T \in R$ there is a net $\{T_\alpha\}$ in $A$ that converges to $T$ in the s.o.t. and
satisfies $\|T_\alpha\| = \|T\|$ for each $\alpha$. (Hint: Exercise 25 of Chapter 6.)


9. Prove that a non-degenerate *-subalgebra of $B(H)$ is a von Neumann
algebra on $H$ iff its (norm-) closed unit ball is w.o.-compact.


10. Prove that $c_0 := C_0(\mathbb{N})$ is not a dual Banach space.


11. Prove that the Banach space $\ell^1$ admits (at least) two non-isometrically
isomorphic preduals. (Hint: use Exercises 8 and 9 in Chapter ??.)


12. Let $X$ be a predual of a Banach space $Y$. View $X$ as a closed subspace of
$Y^*$ as explained in Section 12.5. Then:


(i) $X \subset Y^*$ is norming for $Y$, that is, for every $y \in Y$,
$\|y\| = \sup\{|x(y)|;\ x \in X,\ \|x\| \le 1\}$.


(ii) The closed unit ball of $Y$ is compact in the $X$-topology.


Indeed, (i) follows since $Y$ is isometrically isomorphic to $X^*$ and we view
$X$ as embedded in $Y^*$ canonically, and (ii) holds by Alaoglu's theorem.
We show that properties (i) and (ii) characterize $X$ being a predual
of $Y$. Thus, let again $Y$ be a Banach space and $X$ be a closed subspace
of $Y^*$ such that (i) and (ii) hold.


(a) For $y \in Y$, consider the map $f(y) := \hat{y}|_X : x \mapsto x(y)$ on $X$. It is
obviously a bounded linear functional, thus an element of $X^*$. Explain
why $f : Y \to X^*$ is a linear isometry.


(b) Prove that the range of $f$ is dense in $X^*$ in the $X$-topology.


(c) Explain why the image under $f$ of the closed unit ball of $Y$ is compact
in the $X$-topology, and deduce that the range of $f$ is closed in the
$X$-topology. In conclusion, $f$ is onto $X^*$.


13. Let $R_1, R_2$ be von Neumann algebras.


(a) Let $\Phi : R_1 \to R_2$ be a linear map. Prove that the following conditions
are equivalent:
(i) $\Phi$ is weak*-continuous (i.e., continuous when both $R_1$ and $R_2$
are equipped with their respective weak* topologies);
(ii) $\Phi$ is ultraweakly continuous;
(iii) the restriction of $\Phi$ to the closed unit ball of $R_1$ is
weak*-continuous; and
(iv) the restriction of $\Phi$ to the closed unit ball of $R_1$ is
w.o.-continuous.




Prove that if $\Phi$ is a homomorphism, then the conditions shown are
also equivalent to:
(v) $\Phi$ is normal: for every bounded increasing net $\{T_\alpha\}_\alpha$ in $(R_1)_+$
with least upper bound $T$, the least upper bound of $\{\Phi(T_\alpha)\}_\alpha$
in $(R_2)_+$ is $\Phi(T)$; equivalently, $\Phi(T) = \lim_\alpha \Phi(T_\alpha)$ in the w.o.t.

(b) Prove that every *-isomorphism from $R_1$ onto $R_2$ is
weak*-continuous.


(c) Prove that if $\Phi : R_1 \to R_2$ is a unital *-homomorphism satisfying
the equivalent conditions of Part (a), then its image $\Phi(R_1)$ is a von
Neumann subalgebra of $R_2$.


14. Recall from Section 11.8 that the set $S(R)$ of states of $R$ is convex and
weak*-compact, that is, compact with respect to the weak* topology on
$R^*$. Prove that the set $S_n(R)$ of normal states of $R$ is weak*-dense in
$S(R)$.


15. Prove that a *-subalgebra of $B(H)$ is ultraweakly closed in $B(H)$ iff it is
w.o.-closed in $B(H)$ (iff it is s.o.-closed in $B(H)$).


16. Prove that $\operatorname{Tr}(TS) = \operatorname{Tr}(ST)$ for all $T, S \in L^2(H)$.


We proved in Proposition 12.24 that every element of L1(#) has a trace. 
In this exercise we prove the converse. In fact, we show that the following 
a priori weaker condition on T € B(H) implies that T € L1(H): for each 
orthonormal basis {€a},¢4 of H we have )) ye 4 |(Tea; €a)| < ©. 


(a) Recall that we denote RT := $(1+7*) and ST := +(T—T*). Show 
that |((RT)e, e)|, |((ST)e, e)| < |(Le,e)| for alle EH. 

(b) Use Part (a) to reduce the problem to selfadjoint operators T. 

(c) Assuming that T is selfadjoint, let E be the resolution of the identity 
for T. Let {ea} yep and {€a},¢c be orthonormal bases of E([0,00))H 
and E((—oo,0])H, respectively, with BNC = 0. Then, with A := 
BUC, {ea}gea is an orthonormal basis of H. Prove that ||T'||, = 
eed (Peer Bee)|s 


Let (X,A, 4) be a positive measure space as in Lusin’s theorem (3.20) 
and assume additionally that it is o-finite. Prove that {My; f €¢ C.(X)} 
is s.o.-dense in the von Neumann algebra { My; f € L°(X, A, u)} 

(cf. Corollary 3.21). 

(a) Prove that R is commutative iff R CR’. 


(b) Prove that R is a maximal commutative von Neumann subalgebra of 
B(H) iff R= PR. 
Henceforth we assume that R is commutative. 


(c) Prove that every cyclic vector for R is also separating for R. 


(d) Prove that if $R$ has a cyclic vector, then it is a maximal commutative
von Neumann subalgebra of $B(H)$.

(e) Prove that if $H$ is separable then the converse of Part (d) holds: if $R$
is a maximal commutative von Neumann subalgebra of $B(H)$, then
$R$ has a cyclic vector. (Hint: use Zorn’s lemma to obtain a (possibly
finite) sequence $\{x_n\}$ in $H$ such that the family $\{\overline{R x_n}\}$ consists of
mutually orthogonal closed subspaces of $H$ whose direct sum is $H$
(notice that the projection $P_n$ of $H$ onto $\overline{R x_n}$ belongs to $R' = R$).
Set $x := \sum_n \frac{1}{n} x_n$. Show that for each $n$ we have $P_n x = \frac{1}{n} x_n$, and
therefore $\overline{R x}$ contains $R x_n$.)


20. What is the relationship between the universal representation of a C*-
algebra $A$ and the universal normal representation of the enveloping von
Neumann algebra of $A$?


21. Let $A$ be a C*-algebra, and let $\tilde{\pi} : A^{**} \to \pi_u(A)''$ be the isomorphism
from Theorem 12.34. Define a product on $A^{**}$ by “pulling back” the
product on $\pi_u(A)''$, that is, letting $X \cdot Y := \tilde{\pi}^{-1}(\tilde{\pi}(X)\tilde{\pi}(Y))$ for
$X, Y \in A^{**}$. Prove that this product is the same as the first and second
Arens products on $A^{**}$ (see Section 7.5). In particular, $A$ is Arens regular.


13 


Constructions of 
C*-algebras 


This chapter provides two examples of constructions of C*-algebras, namely, 
tensor products of C*-algebras and group C*-algebras, which are fundamental and 
prevalent in C*-algebra theory. Both constructions have von Neumann algebraic 
analogues, which in fact preceded the C*-algebraic ones and go back to Murray 
and von Neumann. Tensor products of von Neumann algebras are discussed in 
the exercises. 

The theory of tensor products (initially called direct products) of C*-algebras 
started with the work of Turumaru in 1952 and received a big push from Takesaki. 
To define a tensor product of two C*-algebras $A$ and $B$ we need a norm on the
algebraic tensor product $A \otimes_{\mathrm{alg}} B$ with respect to which the completion is a
C*-algebra. Surprisingly, there is usually more than one such norm, but there 
is always a smallest and a largest. Section 13.1 identifies these two norms and 
characterizes the associated tensor products. 

For a locally compact topological group G, the commutative C*-algebra 
Co(G) takes into account only the topology on G and ignores the group structure 
of G. Section 13.2 constructs the group C*-algebra of G, denoted C*(G). This 
is a C*-algebra reflecting both the topology and the group structure of G. 
The C*-algebra C*(G) is constructed in such a way that its non-degenerate 
representations are in bijective correspondence with the (s.o.-continuous) unitary 
representations of G. When G is discrete, C*(G) is a C*-algebraic analogue of 
the complex group algebra C[G]. 


13.1 Tensor products of C*-algebras 


This section shows how one can tensor C*-algebras. Tensor products are funda- 
mental in operator algebras and form a rich source of examples. 


Introduction to Modern Analysis. Second Edition. Shmuel Kantorovitz and Ami Viselter, Oxford University Press. 
© Shmuel Kantorovitz and Ami Viselter (2022). DOI: 10.1093/oso/9780192849540.003.0013




Roughly speaking, the tensor product of two algebras is to multiplication
what the direct sum of two vector spaces is to summation and what the direct
product of two groups is to multiplication. The direct sum $U \oplus V$ of two vector
spaces $U, V$ is a new vector space that contains copies of $U$ and $V$ in a way
that they span $U \oplus V$ and do not “interact” linearly, that is, $U \cap V = \{0\}$.
Similarly, the direct product $H \times K$ of two groups $H, K$ is a new group containing
commuting copies of the groups $H$ and $K$ that generate $H \times K$ as a group and
do not interact algebraically apart from commutation. Likewise, the (algebraic)
tensor product $A \otimes B$ of two (say unital) algebras $A, B$ will be a new algebra
containing commuting copies of $A$ and $B$ that generate $A \otimes B$ as an algebra and
do not interact algebraically apart from commutation. (Note that removing
commutativity of the copies of $A$ and $B$ from this last description leads to free
products of algebras, which are not discussed in this book.)

A C*-algebraic tensor product of two C*-algebras A and B is a C*-algebra 
containing the algebraic tensor product of A and B as a dense subalgebra. Thus, 
one should come up with a “suitable” norm on the algebraic tensor product and 
then complete it. However, it turns out that finding such a norm is not at all 
straightforward. In fact, usually there is more than one such norm! 

We assume that the reader is familiar with Section 8.8. 


13.1.1 Tensor products of algebras 


We begin with the algebraic setting. Let $A, B$ be algebras over a field $F$. Consider
the algebraic tensor product $A \otimes B$ of $A$ and $B$ as vector spaces. Recall that
$A \otimes B$ is spanned by the “simple tensors” $a \otimes b$ ($a \in A$, $b \in B$) and that the map
$(a, b) \mapsto a \otimes b$ is bilinear.

Endow the vector space $A \otimes B$ with the product defined on simple tensors by

$$(a_1 \otimes b_1) \cdot (a_2 \otimes b_2) := (a_1 a_2) \otimes (b_1 b_2) \qquad (a_1, a_2 \in A,\ b_1, b_2 \in B)$$

and extended linearly. It is immediate from the definition of the vector space
$A \otimes B$ that this product is well defined and that it makes $A \otimes B$ into an algebra
over $F$. The latter is called the tensor product of the algebras $A$ and $B$, and is
also denoted by $A \otimes B$.

If $B$ is unital with unit $1_B$, then the map $a \mapsto a \otimes 1_B$ is an injective
homomorphism from $A$ into $A \otimes B$ (injectivity follows from Part (vi) of
Theorem 8.16), giving a “copy” $A \otimes 1_B := \{a \otimes 1_B ; a \in A\}$ of the algebra $A$
inside the algebra $A \otimes B$. A similar statement holds true when $A$ is unital.
Hence, when both $A, B$ are unital, there are canonical commuting copies of $A$
and $B$ inside $A \otimes B$, which generate $A \otimes B$ as an algebra because
$a \otimes b = (a \otimes 1_B)(1_A \otimes b) = (1_A \otimes b)(a \otimes 1_B)$ for all $a \in A$ and $b \in B$.
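The product rule on simple tensors and the commuting copies can be checked concretely in a small case. The following is a minimal sketch, assuming $A = B = M_2(\mathbb{Z})$ and realising $a \otimes b$ as the Kronecker product matrix in $M_4(\mathbb{Z})$; the helper names `matmul`, `kron`, and `identity` are illustrative and do not come from the text.

```python
def matmul(x, y):
    """Ordinary matrix product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def kron(a, b):
    """Kronecker product: the block matrix whose (i, j) block is a[i][j] * b."""
    rb, cb = len(b), len(b[0])
    return [[a[i // rb][j // cb] * b[i % rb][j % cb]
             for j in range(len(a[0]) * cb)]
            for i in range(len(a) * rb)]


def identity(n):
    """The n x n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


a1, a2 = [[1, 2], [3, 4]], [[0, 1], [1, 1]]
b1, b2 = [[2, 0], [1, -1]], [[1, 1], [0, 3]]
one = identity(2)

# (a1 (x) b1)(a2 (x) b2) == (a1 a2) (x) (b1 b2)
product_rule_holds = (
    matmul(kron(a1, b1), kron(a2, b2)) == kron(matmul(a1, a2), matmul(b1, b2))
)

# a (x) b == (a (x) 1)(1 (x) b) == (1 (x) b)(a (x) 1)
left = matmul(kron(a1, one), kron(one, b1))
right = matmul(kron(one, b1), kron(a1, one))
copies_commute = left == right == kron(a1, b1)

print(product_rule_holds, copies_commute)
```

Integer entries are used so that the equalities are exact rather than approximate.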

Parts (i)—(iv) of Theorem 8.16 have analogues for tensor products of algebras, 
in which the linear maps are algebra homomorphisms. 


Example 13.1. If $A$ is an algebra over a field $F$ and $n \in \mathbb{N}$, then the algebras
$M_n(F) \otimes A$ and $M_n(A)$ are canonically isomorphic. Indeed, writing $(e_{ij})_{1 \le i,j \le n}$
for the matrix units of $M_n(F)$, the map that sends a matrix $(a_{ij})_{1 \le i,j \le n} \in M_n(A)$
to $\sum_{i,j=1}^n e_{ij} \otimes a_{ij} \in M_n(F) \otimes A$ is an algebra isomorphism: it
is a surjective linear map, which is injective by Part (vi) of Theorem 8.16, and
is also multiplicative.
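As an illustration of Example 13.1, here is a small sketch under the assumption that $A = M_2(\mathbb{Z})$, so that $M_2(A)$ can be identified with $4 \times 4$ block matrices and $e_{ij} \otimes a_{ij}$ with a Kronecker product. The names `kron`, `matrix_unit`, and `add` are hypothetical helpers introduced only for this check.

```python
def kron(a, b):
    """Kronecker product: the block matrix whose (i, j) block is a[i][j] * b."""
    rb, cb = len(b), len(b[0])
    return [[a[i // rb][j // cb] * b[i % rb][j % cb]
             for j in range(len(a[0]) * cb)]
            for i in range(len(a) * rb)]


def matrix_unit(n, i, j):
    """The matrix unit e_ij in M_n: 1 in position (i, j), 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


def add(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


n, m = 2, 2
# A matrix (a_ij) in M_2(A) with A = M_2(Z): blocks[i][j] is the entry a_ij.
blocks = [
    [[[1, 2], [3, 4]], [[0, 1], [1, 0]]],
    [[[5, 0], [0, 5]], [[-1, 2], [2, -1]]],
]

# The same element written as one 4 x 4 block matrix.
assembled = [[blocks[r // m][c // m][r % m][c % m] for c in range(n * m)]
             for r in range(n * m)]

# The image sum_{i,j} e_ij (x) a_ij under the map of Example 13.1.
summed = [[0] * (n * m) for _ in range(n * m)]
for i in range(n):
    for j in range(n):
        summed = add(summed, kron(matrix_unit(n, i, j), blocks[i][j]))

print(assembled == summed)
```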


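For the next example, recall that for a group $G$ and a field $F$, the group
algebra $F[G]$ is the set of all formal sums of the form $\sum_{g \in G} \alpha_g g$, where $\alpha_g \in F$
for all $g \in G$ and $\{g \in G ; \alpha_g \ne 0\}$ is finite. It is a unital algebra with respect to
the operations

$$\sum_{g \in G} \alpha_g g + \sum_{g \in G} \beta_g g := \sum_{g \in G} (\alpha_g + \beta_g)\, g,$$
$$\alpha \sum_{g \in G} \alpha_g g := \sum_{g \in G} (\alpha \alpha_g)\, g,$$
$$\Bigl(\sum_{g \in G} \alpha_g g\Bigr)\Bigl(\sum_{g \in G} \beta_g g\Bigr) := \sum_{g_1, g_2 \in G} (\alpha_{g_1} \beta_{g_2})\, (g_1 g_2)$$

for $\sum_{g \in G} \alpha_g g, \sum_{g \in G} \beta_g g \in F[G]$ and $\alpha \in F$, and the unit $1_F e$. Write $g$ for $1_F g$.
(As a vector space, $F[G]$ is the direct sum of $|G|$ copies of $F$.)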


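Example 13.2. If $H, K$ are groups and $F$ is a field, then the algebras $F[H] \otimes F[K]$
and $F[H \times K]$ are canonically isomorphic through the map that sends
$\sum_{(h,k) \in H \times K} \alpha_{(h,k)} (h, k) \in F[H \times K]$ to $\sum_{(h,k) \in H \times K} \alpha_{(h,k)}\, h \otimes k \in F[H] \otimes F[K]$.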
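The group algebra product and the isomorphism of Example 13.2 can be tried out on small groups. The sketch below assumes $H = \mathbb{Z}/2$ and $K = \mathbb{Z}/3$ (written additively) with integer coefficients, stores an element of a group algebra as a dictionary from group elements to non-zero coefficients, and represents $x \otimes y$ by its image in $F[H \times K]$. The functions `multiply` and `tensor` are illustrative names, not notation from the text.

```python
def multiply(x, y, op):
    """Product in a group algebra: sum over g1, g2 of (alpha_g1 * beta_g2) (g1 g2)."""
    out = {}
    for g1, c1 in x.items():
        for g2, c2 in y.items():
            g = op(g1, g2)
            out[g] = out.get(g, 0) + c1 * c2
    # Keep only non-zero coefficients so that equal elements are equal dicts.
    return {g: c for g, c in out.items() if c != 0}


def tensor(x, y):
    """Image of the simple tensor x (x) y in the group algebra of H x K."""
    return {(h, k): ch * ck
            for h, ch in x.items() for k, ck in y.items() if ch * ck != 0}


def op_h(a, b):
    return (a + b) % 2          # group law of H = Z/2


def op_k(a, b):
    return (a + b) % 3          # group law of K = Z/3


def op_hk(p, q):
    return (op_h(p[0], q[0]), op_k(p[1], q[1]))   # group law of H x K


x1, x2 = {0: 1, 1: 2}, {1: 3}            # elements of F[H]
y1, y2 = {0: 1, 2: -1}, {1: 2, 2: 1}     # elements of F[K]

# Multiplicativity on simple tensors: (x1 (x) y1)(x2 (x) y2) = (x1 x2) (x) (y1 y2).
lhs = multiply(tensor(x1, y1), tensor(x2, y2), op_hk)
rhs = tensor(multiply(x1, x2, op_h), multiply(y1, y2, op_k))
print(lhs == rhs)
```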


13.1.2 Tensor products of C*-algebras through 
representations 


We return to the analytic setting. Henceforth, we denote the tensor product
of two Hilbert spaces $H, K$ by $H \otimes K$, and recall that this Hilbert space is the
completion of the algebraic tensor product $H \otimes_{\mathrm{alg}} K$ with respect to a canonical
inner product. We also denote the tensor product of two algebras $A, B$, as defined
earlier, by $A \otimes_{\mathrm{alg}} B$. If $A, B$ are *-algebras, then $A \otimes_{\mathrm{alg}} B$ becomes a *-algebra
with the involution $(a \otimes b)^* := a^* \otimes b^*$.

A norm $\|\cdot\|$ on a *-algebra $C$ is called a C*-norm if it is submultiplicative
($\|xy\| \le \|x\| \|y\|$ for all $x, y \in C$) and satisfies the C*-identity ($\|x^* x\| = \|x\|^2$
for all $x \in C$). A semi-norm with the same additional properties is called a C*-semi-norm.
The completion of $C$ with respect to a C*-norm becomes a C*-algebra
upon continuously extending the products and the involution. Such a C*-algebra
is called a C*-completion of $C$.

Let $A, B$ be C*-algebras. A C*-algebraic tensor product of $A$ and $B$ is a
C*-completion of $A \otimes_{\mathrm{alg}} B$. It is natural to ask whether we should “accept” any
C*-norm on $A \otimes_{\mathrm{alg}} B$ or be more restrictive. In particular, it would be desirable
to have the additional property

$$\|a \otimes b\| = \|a\|_A \|b\|_B \qquad (\forall a \in A,\ b \in B). \tag{1}$$

A norm on $A \otimes_{\mathrm{alg}} B$ satisfying (1) is called a cross norm. This notion makes
sense whenever $A, B$ are Banach spaces, and it is indeed taken from the theory
of tensor products of Banach spaces, not discussed in this book. Notice that the
norm on the tensor product of Hilbert spaces is a cross norm; and moreover,
if $H_1, K_1, H_2, K_2$ are Hilbert spaces and $T \in B(H_1, H_2)$, $S \in B(K_1, K_2)$, then
$\|T \otimes S\| = \|T\| \|S\|$ by Part (vi) of Proposition 8.18.
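The identity $\|T \otimes S\| = \|T\| \|S\|$ can be observed numerically in finite dimensions. The following is only a sketch under stated assumptions: the Hilbert spaces are $\mathbb{R}^2$, the operator $T \otimes S$ is represented by the Kronecker product matrix, and the operator norm is approximated by power iteration on $M^{T}M$ (a numerical technique chosen here for illustration, not the argument of Proposition 8.18). The names `kron` and `spectral_norm` are hypothetical.

```python
import math


def kron(a, b):
    """Kronecker product: matrix of T (x) S with respect to the product basis."""
    rb, cb = len(b), len(b[0])
    return [[a[i // rb][j // cb] * b[i % rb][j % cb]
             for j in range(len(a[0]) * cb)]
            for i in range(len(a) * rb)]


def spectral_norm(m, iterations=200):
    """Approximate the operator norm of a real matrix by power iteration on m^T m."""
    rows, cols = len(m), len(m[0])
    v = [1.0] * cols
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # w = m v
        u = [sum(m[i][j] * w[i] for i in range(rows)) for j in range(cols)]  # u = m^T w
        length = math.sqrt(sum(x * x for x in u))
        v = [x / length for x in u]                                          # renormalise
    w = [sum(m[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in w))                                  # ||m v||, ||v|| = 1


T = [[2.0, 1.0], [0.0, 1.0]]
S = [[1.0, 3.0], [2.0, -1.0]]

lhs = spectral_norm(kron(T, S))
rhs = spectral_norm(T) * spectral_norm(S)
print(abs(lhs - rhs) < 1e-9)
```

Power iteration converges here because the starting vector is not orthogonal to the top singular direction of these particular matrices; for other inputs a different starting vector may be needed.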

It is a surprising fact that all C*-norms on the algebraic tensor product
$A \otimes_{\mathrm{alg}} B$ of two C*-algebras are cross norms! The proof of this in Corollary 13.17
requires several stages. As an appetizer, we show that if $A, B$ are unital
C*-algebras, then every C*-norm on $A \otimes_{\mathrm{alg}} B$ satisfies $\|a \otimes b\| \le \|a\|_A \|b\|_B$
for all $a \in A$ and $b \in B$. Indeed, the maps

$$a \mapsto a \otimes 1_B \quad \text{and} \quad b \mapsto 1_A \otimes b \tag{2}$$

are injective *-homomorphisms from $A$ and $B$, respectively, to the C*-completion
$A \otimes B$ of $A \otimes_{\mathrm{alg}} B$ associated with the given C*-norm. These maps are thus
isometric by Theorem 11.2. As a result, for all $a \in A$ and $b \in B$ we have

$$\|a \otimes b\| = \|(a \otimes 1_B)(1_A \otimes b)\| \le \|a \otimes 1_B\| \, \|1_A \otimes b\| = \|a\|_A \|b\|_B .$$


In order to proceed we need to examine the representations of $A \otimes_{\mathrm{alg}} B$. This
occurs in Propositions 13.3, 13.4, and 13.6. By a representation of a *-algebra
$C$ on a Hilbert space $H$ we mean a *-homomorphism $\pi$ from $C$ to $B(H)$. Such
$\pi$ is called faithful if it is injective, and it is called non-degenerate if
$\pi(C)H := \mathrm{span}\{\pi(X)\zeta ; X \in C, \zeta \in H\}$ is dense in $H$.

Representations matter because they are the source of C*-norms. 


Proposition 13.3. Let $C$ be a *-algebra. If $\pi$ is a representation of $C$ on a
Hilbert space $H$, then the formula

$$\|X\|_\pi := \|\pi(X)\|_{B(H)} \qquad (X \in C)$$

defines a C*-semi-norm $\|\cdot\|_\pi$ on $C$, which is a C*-norm (i.e., definite) iff $\pi$
is faithful. Conversely, every C*-norm on $C$ equals $\|\cdot\|_\pi$ for some faithful
non-degenerate representation $\pi$ of $C$.


Proof. The first assertion follows readily from $\pi$ being a *-homomorphism and
$B(H)$ being a C*-algebra.
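For the reader who wants the first assertion spelled out, the verification can be sketched as follows, with $\|\cdot\|$ on the right-hand sides denoting the norm of $B(H)$ and $X, Y \in C$ arbitrary:

```latex
\begin{align*}
\|XY\|_\pi &= \|\pi(X)\pi(Y)\| \le \|\pi(X)\|\,\|\pi(Y)\| = \|X\|_\pi \|Y\|_\pi, \\
\|X^*X\|_\pi &= \|\pi(X)^*\pi(X)\| = \|\pi(X)\|^2 = \|X\|_\pi^2, \\
\|X\|_\pi = 0 &\iff \pi(X) = 0 .
\end{align*}
```

The last line shows that the semi-norm is definite exactly when $\pi$ is injective, that is, faithful.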

Conversely, let $\|\cdot\|$ be a C*-norm on $C$ and write $\overline{C}^{\|\cdot\|}$ for the associated
C*-completion. Let $\pi$ be a faithful (hence isometric, by Theorem 11.2)
non-degenerate representation of $\overline{C}^{\|\cdot\|}$; such $\pi$ exists by the non-commutative
Gelfand–Naimark theorem (11.26). Then $\pi|_C$ is a faithful non-degenerate
representation of $C$, and $\|X\| = \|X\|_{\overline{C}^{\|\cdot\|}} = \|\pi(X)\|_{B(H)}$ for all $X \in C$ by
isometricity of $\pi$.


The C*-completion of $C$ associated with a C*-norm $\|\cdot\|_\pi$ as shown (when $\pi$
is faithful) is naturally identified with the (norm-) closure of $\pi(C)$ in $B(H)$.

The next proposition analyzes the general structure of the representations of
$A \otimes_{\mathrm{alg}} B$.


Proposition 13.4. Let $A, B$ be C*-algebras.


(i) If $\pi_A, \pi_B$ are representations with commuting ranges of $A, B$, respectively,
on the same Hilbert space $H$, then there exists a unique representation
$\pi_A \cdot \pi_B$ of $A \otimes_{\mathrm{alg}} B$ on $H$ satisfying

$$(\pi_A \cdot \pi_B)(a \otimes b) = \pi_A(a)\pi_B(b) \qquad (\forall a \in A,\ b \in B). \tag{3}$$

Also, $\pi_A \cdot \pi_B$ is non-degenerate iff both $\pi_A$ and $\pi_B$ are non-degenerate.


(ii) Every non-degenerate representation $\pi$ of $A \otimes_{\mathrm{alg}} B$ is equal to $\pi_A \cdot \pi_B$ for
unique representations with commuting ranges $\pi_A, \pi_B$ of $A, B$, respectively.


The representations $\pi_A, \pi_B$ of Part (ii) are called the restrictions of $\pi$ to $A, B$,
respectively. The reason for this is that when both algebras are unital, $\pi_A, \pi_B$
are obtained by composing $\pi$ with the embeddings (2).


Proof. (i) Since the map $(a, b) \mapsto \pi_A(a)\pi_B(b)$ from $A \times B$ to $B(H)$ is bilinear,
a unique linear map $\pi_A \cdot \pi_B : A \otimes_{\mathrm{alg}} B \to B(H)$ satisfying (3) exists. It is
multiplicative and *-preserving because so are $\pi_A, \pi_B$ and because their ranges
commute.
The assertion on non-degeneracy holds since
$(\pi_A \cdot \pi_B)(A \otimes_{\mathrm{alg}} B)H = \pi_A(A)(\pi_B(B)H) = \pi_B(B)(\pi_A(A)H)$.
(ii) Let $\pi$ be a non-degenerate representation of $A \otimes_{\mathrm{alg}} B$ on a Hilbert space $H$.
It is quite clear how $\pi_A, \pi_B$ need be defined: for example, we should have
$\pi_A(a)(\pi(x \otimes y)\zeta) := \pi(ax \otimes y)\zeta$. We will show that this makes sense and
gives representations with the desired properties. The trick is to use positive
functionals.

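For each $a \in A$ define a linear operator $\pi_A(a)$ on $\pi(A \otimes_{\mathrm{alg}} B)H$ by

$$\pi_A(a) \sum_{i=1}^n \pi(x_i \otimes y_i)\zeta_i := \sum_{i=1}^n \pi(a x_i \otimes y_i)\zeta_i
\qquad (n \in \mathbb{N} \text{ and } x_i \in A,\ y_i \in B,\ \zeta_i \in H \text{ for } 1 \le i \le n). \tag{4}$$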


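To show that these operators are well defined and bounded, fix $n \in \mathbb{N}$ and $x_i, y_i, \zeta_i$
($1 \le i \le n$) as in (4) and consider the linear functional $\omega$ on the unitization $A^{\#}$
of $A$ given by

$$\omega(x) := \Bigl( \sum_{i=1}^n \pi(x x_i \otimes y_i)\zeta_i ,\ \sum_{j=1}^n \pi(x_j \otimes y_j)\zeta_j \Bigr) \qquad (x \in A^{\#})$$

(if we already knew that $\pi_A$ was well defined, we would simply have $\omega(x) =
(\pi_A(x)\xi, \xi)$ for $\xi := \sum_{i=1}^n \pi(x_i \otimes y_i)\zeta_i$). We assert that $\omega$ is positive. Indeed, for
all $a \in A^{\#}$ we have

$$\begin{aligned}
\omega(a^* a) &= \sum_{i,j=1}^n \bigl( \pi(a^* a x_i \otimes y_i)\zeta_i ,\ \pi(x_j \otimes y_j)\zeta_j \bigr) \\
&= \sum_{i,j=1}^n \bigl( \pi(x_j \otimes y_j)^* \pi(a^* a x_i \otimes y_i)\zeta_i ,\ \zeta_j \bigr) \\
&= \sum_{i,j=1}^n \bigl( \pi(x_j^* a^* a x_i \otimes y_j^* y_i)\zeta_i ,\ \zeta_j \bigr)
 = \sum_{i,j=1}^n \bigl( \pi(a x_j \otimes y_j)^* \pi(a x_i \otimes y_i)\zeta_i ,\ \zeta_j \bigr) \\
&= \Bigl( \sum_{i=1}^n \pi(a x_i \otimes y_i)\zeta_i ,\ \sum_{j=1}^n \pi(a x_j \otimes y_j)\zeta_j \Bigr) \ge 0 .
\end{aligned}$$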

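As a result, $\|\omega\| = \omega(1_{A^{\#}})$ by Theorem 11.18. Hence, for all $a \in A^{\#}$ we have
$\omega(a^* a) \le \|a^* a\| \, \omega(1_{A^{\#}})$, which exactly means that

$$\Bigl\| \sum_{i=1}^n \pi(a x_i \otimes y_i)\zeta_i \Bigr\|^2 \le \|a\|^2 \, \Bigl\| \sum_{i=1}^n \pi(x_i \otimes y_i)\zeta_i \Bigr\|^2 .$$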

This being true for all $a \in A^{\#}$ and all $x_i, y_i, \zeta_i$ as above proves that for each
$a \in A$, the formula (4) gives rise to an operator $\pi_A(a)$ on $\pi(A \otimes_{\mathrm{alg}} B)H$ of norm
at most $\|a\|$. The subspace $\pi(A \otimes_{\mathrm{alg}} B)H$ is dense in $H$ as $\pi$ is non-degenerate,
thus, $\pi_A(a)$ extends to an element of $B(H)$ of the same norm, also denoted
$\pi_A(a)$. Straightforward calculations show that $\pi_A$ is a *-homomorphism, thus a
representation of $A$ on $H$.

One does the same for $\pi_B$. Evidently, $\pi_A$ and $\pi_B$ have commuting ranges and
$\pi = \pi_A \cdot \pi_B$.

Lastly, suppose that also $\pi = \pi'_A \cdot \pi'_B$ for non-degenerate representations
$\pi'_A, \pi'_B$ as in Part (i). Fix an approximate identity $\{e_\alpha\}$ of $A$ and let $b \in B$.
For every $\alpha$ we have $\pi_A(e_\alpha)\pi_B(b) = \pi(e_\alpha \otimes b) = \pi'_A(e_\alpha)\pi'_B(b)$. By Exercise 16
in Chapter 11, taking the s.o.-limit in $\alpha$ yields that $\pi_B(b) = \pi'_B(b)$. Similarly,
$\pi_A = \pi'_A$.


An immediate result is that every C*-norm on $A \otimes_{\mathrm{alg}} B$ is at least “half” a
cross norm:


Corollary 13.5. Let $A, B$ be C*-algebras.

(i) Every representation $\pi$ of $A \otimes_{\mathrm{alg}} B$ satisfies

$$\|\pi(a \otimes b)\| \le \|a\|_A \|b\|_B \qquad (\forall a \in A,\ b \in B).$$

(ii) Every C*-norm $\|\cdot\|$ on $A \otimes_{\mathrm{alg}} B$ satisfies

$$\|a \otimes b\| \le \|a\|_A \|b\|_B \qquad (\forall a \in A,\ b \in B)$$

(i.e., this inequality is satisfied in every C*-completion of $A \otimes_{\mathrm{alg}} B$).


Proof. (i) We can assume that $\pi$ is non-degenerate, because otherwise it can be
replaced by its restriction to a non-degenerate representation on the closure of
$\pi(A \otimes_{\mathrm{alg}} B)H$, where $H$ is the Hilbert space on which $\pi$ represents.

With $\pi_A, \pi_B$ as in Part (ii) of Proposition 13.4 we get $\|\pi(a \otimes b)\| =
\|\pi_A(a)\pi_B(b)\| \le \|a\|_A \|b\|_B$ because representations of C*-algebras are
contractive.

(ii) This follows from Part (i) and Proposition 13.3.


We have been discussing C*-norms on $A \otimes_{\mathrm{alg}} B$, but we still have not proved
the existence of such norms. By Proposition 13.3, this amounts to exhibiting
faithful representations of $A \otimes_{\mathrm{alg}} B$. This is what we do next.

Let $\rho, \sigma$ be representations of the C*-algebras $A, B$ on Hilbert spaces $H, K$,
respectively. The linear map

$$\rho \otimes \sigma : A \otimes_{\mathrm{alg}} B \to B(H) \otimes_{\mathrm{alg}} B(K)$$

sending $a \otimes b$ to $\rho(a) \otimes \sigma(b)$ for $a \in A, b \in B$ (see Part (iv) of Proposition 8.16)
is a *-homomorphism of *-algebras. It is injective if so are $\rho$ and $\sigma$ by Part (ix)
of the same proposition. Furthermore, by Exercise 27 in Chapter 8, we can
identify $B(H) \otimes_{\mathrm{alg}} B(K)$ as a *-subalgebra of $B(H \otimes K)$ by identifying $T \otimes S \in
B(H) \otimes_{\mathrm{alg}} B(K)$ with the element of $B(H \otimes K)$ having the same notation given
by $(T \otimes S)(\zeta \otimes \eta) = T\zeta \otimes S\eta$ for all $\zeta \in H, \eta \in K$. Hence, in the sequel we view
the *-homomorphism $\rho \otimes \sigma$ as a map

$$\rho \otimes \sigma : A \otimes_{\mathrm{alg}} B \to B(H \otimes K),$$

which is thus a representation of $A \otimes_{\mathrm{alg}} B$ on $H \otimes K$, faithful if so are $\rho$ and $\sigma$.
From Proposition 13.3 we infer the following.


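Proposition 13.6. Let $\rho, \sigma$ be representations of C*-algebras $A, B$ on Hilbert
spaces $H, K$, respectively. The formula

$$\|X\|_{\rho,\sigma} := \|(\rho \otimes \sigma)(X)\|_{B(H \otimes K)} \qquad (X \in A \otimes_{\mathrm{alg}} B)$$

defines a C*-semi-norm $\|\cdot\|_{\rho,\sigma}$ on $A \otimes_{\mathrm{alg}} B$, which is a C*-norm if $\rho$ and $\sigma$ are
faithful.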


Faithful representations of $A$ and $B$ do exist by the non-commutative
Gelfand–Naimark theorem. As a result, we can state the following.


Corollary 13.7. For C*-algebras $A$ and $B$ there is at least one C*-norm on
$A \otimes_{\mathrm{alg}} B$.


We denote by $A \otimes_{\rho,\sigma} B$ the C*-completion of $(A \otimes_{\mathrm{alg}} B, \|\cdot\|_{\rho,\sigma})$ for faithful
$\rho, \sigma$ as in Proposition 13.6. Then $A \otimes_{\rho,\sigma} B$ is naturally identified with the closure
in $B(H \otimes K)$ of the algebraic tensor product $\rho(A) \otimes_{\mathrm{alg}} \sigma(B)$.

Proposition 13.6 raises the following question: does the C*-norm $\|\cdot\|_{\rho,\sigma}$ really
depend on $\rho$ and $\sigma$? The answer is negative. We prove this in Section 13.1.5.




13.1.3 The maximal tensor product 
The results of Section 13.1.2 allow a clear identification of the largest C*-norm
on $A \otimes_{\mathrm{alg}} B$.

Theorem 13.8. Let $A, B$ be C*-algebras. The formula

$$\|X\|_{\max} := \sup \{ \|\pi(X)\| ; \pi \text{ is a representation of } A \otimes_{\mathrm{alg}} B \} \tag{5}$$

for $X \in A \otimes_{\mathrm{alg}} B$ defines a C*-norm $\|\cdot\|_{\max}$ on $A \otimes_{\mathrm{alg}} B$, which is larger than
every other C*-norm $\|\cdot\|$ on $A \otimes_{\mathrm{alg}} B$: $\|X\| \le \|X\|_{\max}$ for all $X \in A \otimes_{\mathrm{alg}} B$.


Proof. First, for $X \in A \otimes_{\mathrm{alg}} B$, the supremum in (5) is finite, because writing
$X = \sum_{i=1}^n a_i \otimes b_i$, for every representation $\pi$ of $A \otimes_{\mathrm{alg}} B$ we have
$\|\pi(X)\| \le \sum_{i=1}^n \|a_i\|_A \|b_i\|_B$ by Corollary 13.5.

The function $\|\cdot\|_{\max}$ dominates every C*-norm $\|\cdot\|$ on $A \otimes_{\mathrm{alg}} B$ by
Proposition 13.3. And since there exists at least one C*-norm on $A \otimes_{\mathrm{alg}} B$ by
Corollary 13.7, we deduce that $\|X\|_{\max} > 0$ for all $0 \ne X \in A \otimes_{\mathrm{alg}} B$.

The rest of the conditions for $\|\cdot\|_{\max}$ to be a submultiplicative norm on
$A \otimes_{\mathrm{alg}} B$ that satisfies the C*-identity are easily checked.


Remark 13.9. In (5) it suffices to take just non-degenerate representations $\pi$.


The C*-norm $\|\cdot\|_{\max}$ is called the maximal C*-norm, and the associated
C*-completion of $A \otimes_{\mathrm{alg}} B$ is called the maximal tensor product of $A$ and $B$ and
is denoted by $A \otimes_{\max} B$.

The maximal tensor product is “universal with respect to the representations
of $A \otimes_{\mathrm{alg}} B$”.


Corollary 13.10. Let $A, B$ be C*-algebras. Every representation of $A \otimes_{\mathrm{alg}} B$
extends to a representation of $A \otimes_{\max} B$. Conversely, every representation of
$A \otimes_{\max} B$ arises this way from a unique representation of $A \otimes_{\mathrm{alg}} B$.


Proof. If $\pi$ is a representation of $A \otimes_{\mathrm{alg}} B$ on a Hilbert space $H$, then
$\|\pi(X)\| \le \|X\|_{\max}$ by the definition of $\|\cdot\|_{\max}$. In other words, $\pi$ is contractive as a map
from $(A \otimes_{\mathrm{alg}} B, \|\cdot\|_{\max})$ to $B(H)$. It thus extends uniquely to a bounded map
from $A \otimes_{\max} B$ to $B(H)$, which is a representation of $A \otimes_{\max} B$. The converse
statement is obvious.


13.1.4 Tensor products of bounded linear functionals 


We prove several technical results on the existence of “tensor products” of 
bounded linear functionals, which are used in Section 13.1.5. 

The reader should not be confused by $\omega \otimes \tau$ being given several meanings
depending on the context: an element of $B(H \otimes K)_*$ (Lemma 13.11), a bounded
linear functional on the (norm-) closure of $B(H) \otimes_{\mathrm{alg}} B(K) \subset B(H \otimes K)$
(Lemma 13.12) and on $A \otimes_{\rho,\sigma} B$ (Proposition 13.13), and of course a functional
on various algebraic tensor products. These are related to one another in obvious
ways. For instance, if $\omega \in B(H)_*$ and $\tau \in B(K)_*$, then the functional $\omega \otimes \tau$ of
Lemma 13.11 extends the functional $\omega \otimes \tau$ of Lemma 13.12, which in turn extends
the functional $\omega \otimes \tau$ on $B(H) \otimes_{\mathrm{alg}} B(K)$.


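Lemma 13.11. Let $H, K$ be Hilbert spaces. Let $\omega \in B(H)_*$ and $\tau \in B(K)_*$.
There exists a unique linear functional $\omega \otimes \tau \in B(H \otimes K)_*$ satisfying

$$(\omega \otimes \tau)(a \otimes b) = \omega(a)\tau(b) \qquad (\forall a \in B(H),\ b \in B(K)). \tag{6}$$

Additionally, $\|\omega \otimes \tau\| = \|\omega\| \|\tau\|$, and $\omega \otimes \tau$ is positive if $\omega, \tau$ are positive.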
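Proof. By Corollary 12.27, we can write $\omega = \sum_{n=1}^\infty \omega_{\zeta_n, \eta_n}$ and
$\tau = \sum_{m=1}^\infty \omega_{\zeta'_m, \eta'_m}$ (with convergence in the norm topology of $B(H)^*$ and $B(K)^*$,
respectively) for sequences $\{\zeta_n\}_{n=1}^\infty$ and $\{\eta_n\}_{n=1}^\infty$ in $H$ and $\{\zeta'_m\}_{m=1}^\infty$ and
$\{\eta'_m\}_{m=1}^\infty$ in $K$ satisfying

$$\sum_{n=1}^\infty \|\zeta_n\|^2,\ \sum_{n=1}^\infty \|\eta_n\|^2,\ \sum_{m=1}^\infty \|\zeta'_m\|^2,\ \sum_{m=1}^\infty \|\eta'_m\|^2 < \infty$$

and

$$\|\omega\| = \sum_{n=1}^\infty \|\zeta_n\| \|\eta_n\|, \qquad \|\tau\| = \sum_{m=1}^\infty \|\zeta'_m\| \|\eta'_m\|,$$

and moreover, if $\omega$ and $\tau$ are positive, we can arrange that $\zeta_n = \eta_n$ and $\zeta'_m = \eta'_m$
for all $n, m \in \mathbb{N}$. For $n, m \in \mathbb{N}$ consider the vectors $\zeta_n \otimes \zeta'_m$ and $\eta_n \otimes \eta'_m$ in $H \otimes K$.
We have

$$\sum_{n,m=1}^\infty \|\zeta_n \otimes \zeta'_m\|^2 = \sum_{n,m=1}^\infty \|\zeta_n\|^2 \|\zeta'_m\|^2
= \Bigl( \sum_{n=1}^\infty \|\zeta_n\|^2 \Bigr) \Bigl( \sum_{m=1}^\infty \|\zeta'_m\|^2 \Bigr) < \infty,$$

and similarly $\sum_{n,m=1}^\infty \|\eta_n \otimes \eta'_m\|^2 < \infty$. As a result, the series
$\sum_{n,m=1}^\infty \omega_{\zeta_n \otimes \zeta'_m,\ \eta_n \otimes \eta'_m}$ converges (absolutely) in $B(H \otimes K)^*$ to an element
$\omega \otimes \tau$ of $B(H \otimes K)_*$. For $a \in B(H)$ and $b \in B(K)$ we have

$$\begin{aligned}
(\omega \otimes \tau)(a \otimes b) &= \sum_{n,m=1}^\infty \bigl( (a \otimes b)(\zeta_n \otimes \zeta'_m),\ \eta_n \otimes \eta'_m \bigr) \\
&= \sum_{n,m=1}^\infty \bigl( a\zeta_n \otimes b\zeta'_m,\ \eta_n \otimes \eta'_m \bigr) \\
&= \sum_{n,m=1}^\infty (a\zeta_n, \eta_n)(b\zeta'_m, \eta'_m) = \omega(a)\tau(b).
\end{aligned}$$

Also

$$\begin{aligned}
\|\omega \otimes \tau\| &\le \sum_{n,m=1}^\infty \|\omega_{\zeta_n \otimes \zeta'_m,\ \eta_n \otimes \eta'_m}\|
= \sum_{n,m=1}^\infty \|\zeta_n \otimes \zeta'_m\| \, \|\eta_n \otimes \eta'_m\| \\
&= \sum_{n,m=1}^\infty \|\zeta_n\| \|\zeta'_m\| \|\eta_n\| \|\eta'_m\|
= \Bigl( \sum_{n=1}^\infty \|\zeta_n\| \|\eta_n\| \Bigr) \Bigl( \sum_{m=1}^\infty \|\zeta'_m\| \|\eta'_m\| \Bigr) = \|\omega\| \|\tau\| .
\end{aligned}$$

On the other hand, it follows from (6) that $\|\omega\| \|\tau\| \le \|\omega \otimes \tau\|$, because if
$a \in B(H)$ and $b \in B(K)$ are of norm 1, then so is $a \otimes b \in B(H \otimes K)$.

If $\zeta_n = \eta_n$ and $\zeta'_m = \eta'_m$ for all $n, m \in \mathbb{N}$, then $\zeta_n \otimes \zeta'_m = \eta_n \otimes \eta'_m$ for all
$n, m \in \mathbb{N}$, and hence, $\omega \otimes \tau$ is positive.

The functional $\omega \otimes \tau$ is unique due to (6) because it is ultraweakly continuous
and $B(H) \otimes_{\mathrm{alg}} B(K) \subset B(H \otimes K)$ is ultraweakly dense in $B(H \otimes K)$ by Exercise 1.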


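Lemma 13.12. Let $H, K$ be Hilbert spaces. Let $\omega \in B(H)^*$ and $\tau \in B(K)^*$.
There exists a unique bounded linear functional $\omega \otimes \tau$ on the (norm-) closure $C$
of $B(H) \otimes_{\mathrm{alg}} B(K) \subset B(H \otimes K)$ in $B(H \otimes K)$ satisfying

$$(\omega \otimes \tau)(a \otimes b) = \omega(a)\tau(b) \qquad (\forall a \in B(H),\ b \in B(K)). \tag{7}$$

Additionally, $\|\omega \otimes \tau\| = \|\omega\| \|\tau\|$, and $\omega \otimes \tau$ is positive if $\omega, \tau$ are positive.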


Proof. By Goldstine’s theorem (5.25), there are nets $\{\omega_\alpha\}_\alpha$ in $B(H)_*$ and
$\{\tau_\beta\}_\beta$ in $B(K)_*$, bounded by $\|\omega\|$ and $\|\tau\|$, and weak*-converging to $\omega$ and
$\tau$, respectively. Moreover, if $\omega$ and $\tau$ are positive, we can choose these nets to
consist of positive functionals by Exercise 14 in Chapter 12. Consider the net
$\{\omega_\alpha \otimes \tau_\beta\}_{(\alpha,\beta)}$ in $B(H \otimes K)_*$ defined by Lemma 13.11. It is bounded by
$\|\omega\| \|\tau\|$, and thus has a weak*-cluster point $\gamma$ in $B(H \otimes K)^*$. If $\omega$ and $\tau$ are
positive, so is $\gamma$.

Let $\omega \otimes \tau := \gamma|_C$. It is evident that (7) holds. Also $\|\omega \otimes \tau\| \le \|\gamma\| \le \|\omega\| \|\tau\|$.
On the other hand, it follows from (7) that $\|\omega\| \|\tau\| \le \|\omega \otimes \tau\|$.

Finally, uniqueness of $\omega \otimes \tau$ is a consequence of (7), the continuity of $\omega \otimes \tau$,
and $B(H) \otimes_{\mathrm{alg}} B(K) \subset B(H \otimes K)$ being (norm-) dense in $C$.


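We are ready to prove the existence of certain bounded linear functionals on
$A \otimes_{\rho,\sigma} B$.


Proposition 13.13. Let $A, B$ be C*-algebras, $\rho, \sigma$ be faithful representations of
$A, B$ on Hilbert spaces $H, K$, respectively, and $\omega \in A^*, \tau \in B^*$. Then there exists
a (unique) functional $\omega \otimes \tau \in (A \otimes_{\rho,\sigma} B)^*$ satisfying

$$(\omega \otimes \tau)(a \otimes b) = \omega(a)\tau(b) \qquad (\forall a \in A,\ b \in B). \tag{8}$$

Furthermore, $\|\omega \otimes \tau\| = \|\omega\| \|\tau\|$, and $\omega \otimes \tau$ is positive if $\omega, \tau$ are positive.


Proof. Identify $A \otimes_{\rho,\sigma} B$ with the closure of $(\rho \otimes \sigma)(A \otimes_{\mathrm{alg}} B) = \rho(A) \otimes_{\mathrm{alg}} \sigma(B)$ in
$B(H \otimes K)$ and write $C$ for the closure of $B(H) \otimes_{\mathrm{alg}} B(K)$ in $B(H \otimes K)$. Evidently,
$A \otimes_{\rho,\sigma} B \subset C$.

Use the Hahn–Banach theorem to extend $\omega \circ \rho^{-1} \in \rho(A)^*$ and $\tau \circ \sigma^{-1} \in \sigma(B)^*$
to elements $\tilde{\omega} \in B(H)^*$ and $\tilde{\tau} \in B(K)^*$ with the same norms as $\omega$ and $\tau$,
respectively, and being positive if $\omega$ and $\tau$ are positive (the last part is by virtue
of Part (vi) of Theorem 11.20).

Let $\omega \otimes \tau$ be the restriction of $\tilde{\omega} \otimes \tilde{\tau} \in C^*$ of Lemma 13.12 to $A \otimes_{\rho,\sigma} B$. Then
$\omega \otimes \tau \in (A \otimes_{\rho,\sigma} B)^*$, $\|\omega \otimes \tau\| \le \|\omega\| \|\tau\|$, and $\omega \otimes \tau$ is positive if $\omega, \tau$ are positive.
As before, it follows from (8) that $\|\omega\| \|\tau\| \le \|\omega \otimes \tau\|$, thus $\|\omega \otimes \tau\| = \|\omega\| \|\tau\|$.
Uniqueness is clear.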


13.1.5 The minimal tensor product 


We discussed in Proposition 13.6 the C*-norm $\|\cdot\|_{\rho,\sigma}$ on $A \otimes_{\mathrm{alg}} B$ constructed
from faithful representations $\rho, \sigma$ of $A, B$, respectively. If $\omega, \tau$ are states of $A, B$,
respectively, then the functional $\omega \otimes \tau$ constructed in Proposition 13.13 is a state
of $A \otimes_{\rho,\sigma} B$. Theorem 13.14 shows that states of this form are enough for the
computation of norms in $A \otimes_{\rho,\sigma} B$, leading to the conclusion that $A \otimes_{\rho,\sigma} B$ is
independent of the faithful representations $\rho, \sigma$.


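Theorem 13.14. Let $A, B$ be C*-algebras.


(i) Let $\rho, \sigma$ be faithful representations of $A, B$ on Hilbert spaces $H, K$,
respectively. Then for each $X \in A \otimes_{\rho,\sigma} B$ we have

$$\|X\|_{\rho,\sigma}^2 = \sup \Bigl\{ \frac{(\omega \otimes \tau)(Y^* X^* X Y)}{(\omega \otimes \tau)(Y^* Y)} ;\ \omega \in S(A),\ \tau \in S(B),\
Y \in A \otimes_{\mathrm{alg}} B,\ (\omega \otimes \tau)(Y^* Y) > 0 \Bigr\}. \tag{9}$$

(ii) The C*-norms $\|\cdot\|_{\rho,\sigma}$ on $A \otimes_{\mathrm{alg}} B$ are independent of the faithful
representations $\rho, \sigma$. In other words, the tensor product C*-algebras $A \otimes_{\rho,\sigma} B$
are independent of $\rho, \sigma$.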


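Proof. (i) Let $X \in A \otimes_{\rho,\sigma} B$. For every $Y \in A \otimes_{\rho,\sigma} B$ we have
$Y^* X^* X Y \le \|X\|_{\rho,\sigma}^2 \, Y^* Y$, so by positivity of $\omega \otimes \tau \in (A \otimes_{\rho,\sigma} B)^*$ we get
$(\omega \otimes \tau)(Y^* X^* X Y) \le \|X\|_{\rho,\sigma}^2 \, (\omega \otimes \tau)(Y^* Y)$. This proves that
$\|X\|_{\rho,\sigma}^2 \ge \sup \{\cdots\}$ in (9).

To show the converse, we can and will assume that $\rho$ and $\sigma$ are non-degenerate.
Recall from Exercise 23 in Chapter 11 that every non-degenerate
representation of a C*-algebra is the direct sum of cyclic ones (up to unitary
equivalence). So, let $\{\rho_\alpha\}_{\alpha \in I}$ and $\{\sigma_\beta\}_{\beta \in J}$ be families of cyclic (not necessarily
faithful) representations of $A$ and $B$, respectively, such that $\rho = \bigoplus_{\alpha \in I} \rho_\alpha$ and
$\sigma = \bigoplus_{\beta \in J} \sigma_\beta$. Hence the representation $\rho \otimes \sigma$ of $A \otimes_{\mathrm{alg}} B$ is the direct sum of
the representations $\{\rho_\alpha \otimes \sigma_\beta\}_{(\alpha,\beta) \in I \times J}$. Consequently

$$\|X\|_{\rho,\sigma} = \sup_{(\alpha,\beta) \in I \times J} \|X\|_{\rho_\alpha, \sigma_\beta} . \tag{10}$$

Let $(\alpha, \beta) \in I \times J$. The representations $\rho_\alpha$ and $\sigma_\beta$ of $A$ and $B$ on Hilbert spaces
$H_\alpha \subset H$ and $K_\beta \subset K$ have cyclic unit vectors $\zeta_\alpha$ and $\eta_\beta$, thus the subspaces
$\rho_\alpha(A)\zeta_\alpha = \rho(A)\zeta_\alpha$ and $\sigma_\beta(B)\eta_\beta = \sigma(B)\eta_\beta$ are dense in $H_\alpha$ and $K_\beta$, respectively.
Consider the states $\omega_\alpha := \omega_{\zeta_\alpha} \circ \rho$ and $\tau_\beta := \omega_{\eta_\beta} \circ \sigma$ of $A$ and $B$, respectively.
Then $\omega_\alpha \otimes \tau_\beta = \omega_{\zeta_\alpha \otimes \eta_\beta} \circ (\rho \otimes \sigma)$ on $A \otimes_{\mathrm{alg}} B$. Hence

$$(\omega_\alpha \otimes \tau_\beta)(Z^* Z) = \bigl\| ((\rho \otimes \sigma)(Z))(\zeta_\alpha \otimes \eta_\beta) \bigr\|^2 \qquad (\forall Z \in A \otimes_{\mathrm{alg}} B). \tag{11}$$

Notice that

$$\{ ((\rho \otimes \sigma)(Y))(\zeta_\alpha \otimes \eta_\beta) ;\ Y \in A \otimes_{\mathrm{alg}} B \}
= \rho(A)\zeta_\alpha \otimes_{\mathrm{alg}} \sigma(B)\eta_\beta \text{ is dense in } H_\alpha \otimes K_\beta . \tag{12}$$

Thus, by the definition of the operator norm,

$$\begin{aligned}
\|X\|_{\rho_\alpha, \sigma_\beta}^2 = \|(\rho_\alpha \otimes \sigma_\beta)(X)\|^2
&\overset{\text{by (12)}}{=} \sup \Bigl\{ \frac{\bigl\| ((\rho \otimes \sigma)(X))\,((\rho \otimes \sigma)(Y))(\zeta_\alpha \otimes \eta_\beta) \bigr\|^2}{\bigl\| ((\rho \otimes \sigma)(Y))(\zeta_\alpha \otimes \eta_\beta) \bigr\|^2} ;\
Y \in A \otimes_{\mathrm{alg}} B,\ ((\rho \otimes \sigma)(Y))(\zeta_\alpha \otimes \eta_\beta) \ne 0 \Bigr\} \\
&\overset{\text{by (11)}}{=} \sup \Bigl\{ \frac{(\omega_\alpha \otimes \tau_\beta)(Y^* X^* X Y)}{(\omega_\alpha \otimes \tau_\beta)(Y^* Y)} ;\
Y \in A \otimes_{\mathrm{alg}} B,\ (\omega_\alpha \otimes \tau_\beta)(Y^* Y) > 0 \Bigr\} .
\end{aligned}$$

In combination with (10), this shows that $\|X\|_{\rho,\sigma}^2 \le \sup \{\cdots\}$ in (9).
(ii) For $X \in A \otimes_{\mathrm{alg}} B$, the right-hand side of (9), which equals $\|X\|_{\rho,\sigma}^2$ by
Part (i), is independent of the faithful representations $\rho, \sigma$.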


The C*-norm discussed in Theorem 13.14 is called the minimal or the spatial
C*-norm of $A \otimes_{\mathrm{alg}} B$ and it is denoted by $\|\cdot\|_{\min}$. The associated C*-completion
of $A \otimes_{\mathrm{alg}} B$ is called the minimal or the spatial tensor product of $A$ and $B$
and is denoted by $A \otimes_{\min} B$. The word “spatial” comes from (Hilbert) space:
indeed, this C*-norm is obtained by embedding $A$ and $B$ in $B(H)$ and $B(K)$ for
Hilbert spaces $H$ and $K$, respectively (in any way we choose!), and then naturally
embedding $A \otimes_{\mathrm{alg}} B$ in $B(H \otimes K)$ from which we take the norm.


Corollary 13.15. If $\rho, \sigma$ are representations of C*-algebras $A, B$ on Hilbert
spaces $H, K$, then $\rho \otimes \sigma : A \otimes_{\mathrm{alg}} B \to B(H \otimes K)$ extends to a *-homomorphism
from $A \otimes_{\min} B$ to $B(H \otimes K)$, also denoted by $\rho \otimes \sigma$, which is faithful if $\rho, \sigma$ are
faithful.


Proof. This is clear when $\rho, \sigma$ are faithful. The non-faithful case is left to the
reader.


The reason for the adjective “minimal” is the following fundamental theorem, 
which we will not prove. 


Theorem 13.16 (Takesaki’s theorem). Let $A, B$ be C*-algebras.


(i) The minimal C*-norm on $A \otimes_{\mathrm{alg}} B$ is the smallest of all C*-norms on
$A \otimes_{\mathrm{alg}} B$.


(ii) If either $A$ or $B$ is commutative, then there is only one C*-norm on $A \otimes_{\mathrm{alg}} B$.


Corollary 13.17. Let $A, B$ be C*-algebras. Every C*-norm on $A \otimes_{\mathrm{alg}} B$ is a
cross norm.


Proof. One inequality comes from Corollary 13.5 (which was used to define
the maximal tensor product). The other comes from Part (i) of Theorem 13.16,
because the minimal C*-norm is a cross norm.


Theorem 13.8 and Part (i) of Theorem 13.16 imply that every C*-norm $\|\cdot\|$ on
$A \otimes_{\mathrm{alg}} B$ lies between the minimal and the maximal ones: $\|\cdot\|_{\min} \le \|\cdot\| \le \|\cdot\|_{\max}$.

Observe that if $C$ is a *-algebra with C*-norms $\|\cdot\|_1, \|\cdot\|_2$ and associated
C*-completions $C_1, C_2$, and if $\|\cdot\|_1 \le \|\cdot\|_2$, then the identity map of $C$ extends
to a (contractive) *-homomorphism from $C_2$ onto $C_1$, for the image of a
*-homomorphism between two C*-algebras is always a C*-algebra. As a result,
we have a canonical surjective *-homomorphism $A \otimes_{\max} B \to A \otimes_{\min} B$, and
for every C*-completion $A \otimes B$ of $A \otimes_{\mathrm{alg}} B$ there are canonical surjective
*-homomorphisms $A \otimes_{\max} B \to A \otimes B$ and $A \otimes B \to A \otimes_{\min} B$ such that the
following diagram commutes:

$$A \otimes_{\max} B \longrightarrow A \otimes B \longrightarrow A \otimes_{\min} B,$$

where the composition is the canonical map $A \otimes_{\max} B \to A \otimes_{\min} B$.


From this and Corollary 13.15 we infer the following. 


Corollary 13.18. Let $A, B$ be C*-algebras and $A \otimes B$ be a C*-completion of
$A \otimes_{\mathrm{alg}} B$. If $\rho, \sigma$ are representations of $A, B$ on Hilbert spaces $H, K$, then
$\rho \otimes \sigma : A \otimes_{\mathrm{alg}} B \to B(H \otimes K)$ extends to a *-homomorphism from $A \otimes B$ to $B(H \otimes K)$.


A C*-algebra $A$ such that for every C*-algebra $B$ there is a unique C*-norm
on $A \otimes_{\mathrm{alg}} B$ is said to be nuclear. Part (ii) of Theorem 13.16 says that every
commutative C*-algebra is nuclear; Section 13.1.6 provides an explicit realization
of the tensor product in this case.




13.1.6 Tensor products by commutative C*-algebras 


Let $\Omega$ be a locally compact Hausdorff space and $A$ be a C*-algebra. Let

$$C_0(\Omega, A) := \{ F : \Omega \to A ;\ F \text{ is continuous and vanishes at infinity} \}$$

(the vanishing at infinity condition means that for every $\varepsilon > 0$ there is a compact
$K \subset \Omega$ such that $\|F(x)\| < \varepsilon$ for all $x \in \Omega \setminus K$). Then $C_0(\Omega, A)$ is a C*-algebra
with respect to the pointwise operations and the norm

$$\|F\| := \max_{x \in \Omega} \|F(x)\|_A \qquad (F \in C_0(\Omega, A)).$$

This section shows that the unique C*-algebraic tensor product of $C_0(\Omega)$ and $A$
is naturally isomorphic to $C_0(\Omega, A)$ (cf. Exercise 29 in Chapter 8).

For $f \in C_0(\Omega)$ and $a \in A$ let $fa : \Omega \to A$ be given by $x \mapsto f(x)a$ ($x \in \Omega$).
Evidently $fa \in C_0(\Omega, A)$. The map from $C_0(\Omega) \times A$ to $C_0(\Omega, A)$ given by
$(f, a) \mapsto fa$ is bilinear, so there is a (unique) linear map $\Psi : C_0(\Omega) \otimes_{\mathrm{alg}} A \to
C_0(\Omega, A)$ mapping $f \otimes a$ to $fa$ for all $f \in C_0(\Omega)$ and $a \in A$ (see Exercise 22 in
Chapter 8). In fact, $\Psi$ is a *-homomorphism. It is injective, because if
$\sum_{i=1}^n f_i \otimes a_i \in C_0(\Omega) \otimes_{\mathrm{alg}} A$ and $\sum_{i=1}^n f_i a_i = 0$ in $C_0(\Omega, A)$ then assuming, as we may,
that $a_1, \ldots, a_n$ are linearly independent, we get $f_i(x) = 0$ for all $1 \le i \le n$ and
$x \in \Omega$, hence surely $\sum_{i=1}^n f_i \otimes a_i = 0$.

Let us show that $\Psi$ has dense range. Fix $F \in C_0(\Omega, A)$. Given $\epsilon > 0$, let
$K \subset \Omega$ be a compact set such that $\|F(x)\| < \epsilon$ for each $x \in \Omega \setminus K$. For every
$x \in K$ let $U_x \subset \Omega$ be an open neighborhood of $x$ such that $\|F(y) - F(x)\| < \epsilon$
for all $y \in U_x$. By compactness of $K$ there are $x_1, \dots, x_n \in K$ such that $V_1 :=
U_{x_1}, \dots, V_n := U_{x_n}$ is a finite open cover of $K$. By Theorem 3.3 there is an
associated partition of unity, namely, $h_1, \dots, h_n \in C_c(\Omega)$ with values in $[0, 1]$
and supports contained in $V_1, \dots, V_n$, respectively, such that $h_1 + \dots + h_n = 1$
on $K$ and $0 \le h_1 + \dots + h_n \le 1$ everywhere. Consider the function $G :=
\Psi\bigl(\sum_{i=1}^n h_i \otimes F(x_i)\bigr) = \sum_{i=1}^n h_i F(x_i)$. We will show that $\|F - G\| \le 3\epsilon$.

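Indeed, let $x \in \Omega$ and $I := \{1 \le i \le n;\ x \in V_i\}$. Then for each $i \in I$ we have
$\|F(x) - F(x_i)\| < \epsilon$ and for each $i \notin I$ we have $h_i(x) = 0$. If $x \in K$, then
$\sum_{i \in I} h_i(x) = \sum_{i=1}^n h_i(x) = 1$, thus

\[ \|F(x) - G(x)\| = \Bigl\| \sum_{i \in I} h_i(x) \bigl(F(x) - F(x_i)\bigr) \Bigr\| \le \sum_{i \in I} h_i(x) \|F(x) - F(x_i)\| < \epsilon \sum_{i \in I} h_i(x) = \epsilon. \]

If $x \in \Omega \setminus K$, then $\|F(x)\| < \epsilon$, and either $I = \emptyset$ and hence $G(x) = 0$, or $I \ne \emptyset$,
in which case for each $i \in I$ we have $\|F(x_i)\| < \|F(x)\| + \epsilon < 2\epsilon$, and therefore,

\[ \|F(x) - G(x)\| \le \|F(x)\| + \|G(x)\| < \epsilon + \sum_{i \in I} h_i(x) \|F(x_i)\| \le 3\epsilon. \]

The foregoing proves that $\|F - G\| \le 3\epsilon$, as desired.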




In conclusion, the formula

\[ \|c\| := \|\Psi(c)\|_{C_0(\Omega, A)} \qquad (c \in C_0(\Omega) \otimes_{\mathrm{alg}} A) \]

defines on $C_0(\Omega) \otimes_{\mathrm{alg}} A$ a C*-norm, whose associated C*-completion can be
identified with $C_0(\Omega, A)$. By Theorem 13.16, this is the unique C*-norm on
$C_0(\Omega) \otimes_{\mathrm{alg}} A$. As a result, both $C_0(\Omega) \otimes_{\max} A$ and $C_0(\Omega) \otimes_{\min} A$ are canonically
*-isomorphic to $C_0(\Omega, A)$.

We give another proof of the isomorphism between $C_0(\Omega) \otimes_{\min} A$ and
$C_0(\Omega, A)$ in Exercise 6.


13.2 Group C*-algebras 


Throughout this section let $G$ be a locally compact topological group with
identity $e$. We construct a C*-algebra $C^*(G)$ whose structure reflects not only
the topology on $G$ but also its algebraic structure. This is accomplished similarly
to the construction of the maximal tensor product. We begin with $C_c(G)$
endowed with a suitable *-algebra structure in a way that its $L^1(G)$-contractive
representations are in bijection with the unitary representations of $G$. Such
representations are used to produce a C*-norm on $C_c(G)$ whose completion
$C^*(G)$ has a universal property.

Denote by $\mu$ a left Haar measure on $G$ and by $\delta : G \to (0, \infty)$ the modular
function of $G$ (see Section 4.5), and write $L^p(G)$ for $L^p(G, \mu)$.

For $f : G \to \mathbb{C}$ and $t \in G$, the left $t$-translate of $f$ is $f_t : G \to \mathbb{C}$ defined by
$f_t(s) := f(ts)$ ($s \in G$). By the definition of $\mu$, if $f \in L^1(G)$ then $f_t \in L^1(G)$ and
$\int_G f \, d\mu = \int_G f_t \, d\mu$ for all $t \in G$. It follows that for $p \in [1, \infty)$ and $t \in G$, the
map $f \mapsto f_t$ is a linear isometry of $L^p(G)$ onto itself, and in the particular case
when $p = 2$, this map is unitary.

Also, for all $f \in L^1(G)$, the function $G \to \mathbb{C}$ given by $t \mapsto \delta(t^{-1}) f(t^{-1})$
belongs to $L^1(G)$ and its integral equals $\int_G f \, d\mu$.

Moreover, if $p \in [1, \infty)$ and $f \in L^p(G)$, then the function $G \to L^p(G)$ given
by $t \mapsto f_t$ is continuous. This is proved precisely like Exercise 1 in Chapter 3
using the left translation invariance of $\mu$.

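For $f, g \in C_c(G)$ define functions $f * g$ (the convolution of $f$ and $g$) and $f^*$
in $C_c(G)$ by

\[ (f * g)(t) := \int_G f(s) g(s^{-1}t) \, d\mu(s) = \int_G f(ts) g(s^{-1}) \, d\mu(s), \qquad
f^*(t) := \delta(t^{-1}) \overline{f(t^{-1})} \qquad (t \in G). \]

The reader can verify that these operations turn $C_c(G)$ into a *-algebra.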


Example 13.19. Assume that $G$ is discrete. Then the complex group algebra
$\mathbb{C}[G]$ (see Section 13.1.1) is isomorphic to $C_c(G)$ as algebras when for $g \in G$,
the element $g \in \mathbb{C}[G]$ is identified with the indicator function of $\{g\}$. Under this
identification we have $g^* = g^{-1}$ for $g \in G$ ($\delta(\cdot) \equiv 1$ as $G$ is discrete).
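
The identification in Example 13.19 can be tried out on a small case. The following is a minimal sketch, assuming the finite cyclic group $\mathbb{Z}_n$ with counting measure as Haar measure; the helper names `convolve`, `involution` and `indicator` are illustrative choices, not notation from the text.

```python
# Sketch: the group algebra C[Z_n], realized as functions on Z_n
# (counting measure plays the role of the Haar measure).

def convolve(f, g):
    """(f * g)(t) = sum_s f(s) g(s^{-1} t); in Z_n, s^{-1} t is (t - s) mod n."""
    n = len(f)
    return [sum(f[s] * g[(t - s) % n] for s in range(n)) for t in range(n)]

def involution(f):
    """f*(t) = conj(f(t^{-1})); the modular function is identically 1 here."""
    n = len(f)
    return [complex(f[(-t) % n]).conjugate() for t in range(n)]

def indicator(g, n):
    """Indicator function of {g}: the image of the group element g in C[Z_n]."""
    return [1 + 0j if t == g % n else 0j for t in range(n)]

n = 5
# Group multiplication becomes convolution of indicators: 2 + 4 = 1 in Z_5.
assert convolve(indicator(2, n), indicator(4, n)) == indicator(1, n)
# g* = g^{-1}.
assert involution(indicator(2, n)) == indicator(-2, n)
# The involution reverses products: (f * g)* = g* * f*.
f = [1 + 2j, 0j, 3 - 1j, 2 + 0j, -1j]
g = [0j, 1j, 1 + 1j, -2 + 0j, 4 + 0j]
assert involution(convolve(f, g)) == convolve(involution(g), involution(f))
```

The test functions take small integer values, so the floating-point comparisons are exact.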




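Denote by $\mathcal{V}$ the directed set of all open neighborhoods of $e$ ordered by reverse
inclusion. For $V \in \mathcal{V}$ choose a non-negative function $f_V \in C_c(G)$ that is supported in $V$ and
such that $\int_G f_V \, d\mu = 1$. We obtain a net $\{f_V\}_{V \in \mathcal{V}}$. Such a net is called an
approximate identity for $C_c(G)$, because for each $g \in C_c(G)$ we have

\[ (f_V * g - g)(t) = \int_G f_V(s) \bigl(g(s^{-1}t) - g(t)\bigr) \, d\mu(s) \qquad (\forall t \in G), \]

so Fubini's theorem implies that

\[ \|f_V * g - g\|_{L^1(G)} \le \int_G \int_G \bigl| f_V(s) \bigl(g(s^{-1}t) - g(t)\bigr) \bigr| \, d\mu(s) \, d\mu(t)
= \int_G |f_V(s)| \int_G \bigl| g(s^{-1}t) - g(t) \bigr| \, d\mu(t) \, d\mu(s)
\le \sup_{s \in V} \|g_{s^{-1}} - g\|_{L^1(G)} \xrightarrow[V \in \mathcal{V}]{} 0 \]

by the continuity of the map $G \to L^1(G)$ given by $s \mapsto g_{s^{-1}}$.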


Remark 13.20. We could have replaced $C_c(G)$ by $L^1(G)$ throughout this
section. We use $C_c(G)$ because at times it is technically more convenient.


13.2.1 Unitary representations 


For a Hilbert space $H$ write $U(H)$ for the group of unitary operators in $B(H)$. A
unitary representation of $G$ on $H$ is a group homomorphism $\rho : G \to U(H)$ that
is s.o.-continuous, in other words, continuous when $U(H)$ is equipped with the
relative s.o.t. inherited from $B(H)$. Note that $\rho(t^{-1}) = \rho(t)^*$ for all $t \in G$ since
$\rho(e) = I$.


Example 13.21. The unitary representation of $G$ on $\mathbb{C}$ given by $t \mapsto 1$ is called
the trivial representation of $G$.


Example 13.22. For $t \in G$ let $\lambda_t$ be the operator on $L^2(G)$ given by $\lambda_t f := f_{t^{-1}}$,
that is, $(\lambda_t f)(s) := f(t^{-1}s)$ ($f \in L^2(G)$, $s \in G$). As explained previously, $\lambda_t$ is
well defined and unitary. The map $\lambda : G \to U(L^2(G))$ given by $\lambda(t) := \lambda_t$
($t \in G$) is a group homomorphism, as a simple calculation shows. It is also s.o.-
continuous by the continuity of the inverse map on $G$ and the map $t \mapsto f_t$ for
each fixed $f \in L^2(G)$. Thus, $\lambda$ is a unitary representation of $G$ on $L^2(G)$. It is
called the left regular representation of $G$.
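
The "simple calculation" behind Example 13.22 can be checked numerically in a toy case. The sketch below assumes the finite group $\mathbb{Z}_n$, so that $L^2(G)$ is $\mathbb{C}^n$ with the usual inner product; `left_regular` and `inner` are hypothetical helper names.

```python
# Sketch: the left regular representation of Z_n on l^2(Z_n) = C^n.

def left_regular(t, f):
    """(lambda_t f)(s) = f(t^{-1} s); in Z_n this is f((s - t) mod n)."""
    n = len(f)
    return [f[(s - t) % n] for s in range(n)]

def inner(f, g):
    """Inner product of l^2(Z_n), linear in the first variable."""
    return sum(a * complex(b).conjugate() for a, b in zip(f, g))

f = [1 + 1j, 2 + 0j, -1j, 0j, 3 + 0j, -2 + 1j]
g = [0j, 1 + 0j, 1 - 1j, 4 + 0j, 1j, -1 + 0j]
n = len(f)
for s in range(n):
    for t in range(n):
        # Homomorphism property: lambda_s lambda_t = lambda_{s+t}.
        assert left_regular(s, left_regular(t, f)) == left_regular((s + t) % n, f)
    # Unitarity: lambda_s preserves inner products and is inverted by lambda_{-s}.
    assert inner(left_regular(s, f), left_regular(s, g)) == inner(f, g)
    assert left_regular(-s, left_regular(s, f)) == f
```

Each $\lambda_t$ is a permutation of coordinates here, which is why unitarity reduces to reordering a finite sum.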


Let $\rho$ be a unitary representation of $G$ on a Hilbert space $H$ and let $f \in C_c(G)$.
Set

\[ \tilde\rho(f) := \int_G f(t) \rho(t) \, d\mu(t), \]

where the integral converges in the w.o.t. This means that the function from
$H \times H$ to $\mathbb{C}$ that maps the pair $[\zeta, \eta]$ to $\int_G f(t) \langle \rho(t)\zeta, \eta \rangle \, d\mu(t)$ (the integrand
belongs to $C_c(G)$!) is linear in the first variable, conjugate linear in the second
variable, and bounded by $\|f\|_{L^1(G)}$ (as $\|\rho(t)\| = 1$ for all $t$ because $\rho(t)$ is
unitary). Thus, there exists a (unique) operator $\tilde\rho(f) = \int_G f(t) \rho(t) \, d\mu(t) \in B(H)$
such that $\bigl\langle \bigl( \int_G f(t) \rho(t) \, d\mu(t) \bigr) \zeta, \eta \bigr\rangle = \int_G f(t) \langle \rho(t)\zeta, \eta \rangle \, d\mu(t)$ for all $\zeta, \eta \in H$, and
we have $\|\tilde\rho(f)\| \le \|f\|_{L^1(G)}$.
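The function $f \mapsto \tilde\rho(f)$ is evidently linear. For all $f, g \in C_c(G)$ and $\zeta, \eta \in H$ we have

\[ \langle \tilde\rho(f * g)\zeta, \eta \rangle = \int_G (f * g)(t) \langle \rho(t)\zeta, \eta \rangle \, d\mu(t)
= \int_G \Bigl( \int_G f(s) g(s^{-1}t) \, d\mu(s) \Bigr) \langle \rho(t)\zeta, \eta \rangle \, d\mu(t)
= \int_G f(s) \Bigl( \int_G g(s^{-1}t) \langle \rho(t)\zeta, \eta \rangle \, d\mu(t) \Bigr) d\mu(s), \]

where in the last step we used Fubini's theorem. By the left invariance of $\mu$,

\[ \langle \tilde\rho(f * g)\zeta, \eta \rangle = \int_G f(s) \Bigl( \int_G g(t) \langle \rho(st)\zeta, \eta \rangle \, d\mu(t) \Bigr) d\mu(s). \]

On the other hand,

\[ \langle \tilde\rho(f) \tilde\rho(g)\zeta, \eta \rangle = \int_G f(s) \langle \rho(s) \tilde\rho(g)\zeta, \eta \rangle \, d\mu(s)
= \int_G f(s) \langle \tilde\rho(g)\zeta, \rho(s)^*\eta \rangle \, d\mu(s)
= \int_G f(s) \Bigl( \int_G g(t) \langle \rho(t)\zeta, \rho(s)^*\eta \rangle \, d\mu(t) \Bigr) d\mu(s)
= \int_G f(s) \Bigl( \int_G g(t) \langle \rho(st)\zeta, \eta \rangle \, d\mu(t) \Bigr) d\mu(s), \]

proving that $\tilde\rho(f * g) = \tilde\rho(f) \tilde\rho(g)$. Similarly, for all $\zeta, \eta \in H$ we have

\[ \langle \tilde\rho(f^*)\zeta, \eta \rangle = \int_G f^*(t) \langle \rho(t)\zeta, \eta \rangle \, d\mu(t) = \int_G \delta(t^{-1}) \overline{f(t^{-1})} \langle \rho(t)\zeta, \eta \rangle \, d\mu(t)
= \int_G \overline{f(t)} \langle \rho(t^{-1})\zeta, \eta \rangle \, d\mu(t) = \int_G \overline{f(t)} \langle \zeta, \rho(t)\eta \rangle \, d\mu(t)
= \overline{\langle \tilde\rho(f)\eta, \zeta \rangle} = \langle \tilde\rho(f)^*\zeta, \eta \rangle, \]

thus, $\tilde\rho(f^*) = \tilde\rho(f)^*$. This proves that $\tilde\rho$ is a representation of $C_c(G)$ on $H$.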
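Furthermore, if $\{f_V\}_{V \in \mathcal{V}}$ is an approximate identity for $C_c(G)$, then
$\lim_{V \in \mathcal{V}} \tilde\rho(f_V) = I$ in the w.o.t. because

\[ \bigl| \langle \tilde\rho(f_V)\zeta, \eta \rangle - \langle \zeta, \eta \rangle \bigr| = \Bigl| \int_G f_V(t) \langle (\rho(t) - I)\zeta, \eta \rangle \, d\mu(t) \Bigr|
\le \sup_{t \in V} \bigl| \langle (\rho(t) - I)\zeta, \eta \rangle \bigr| \xrightarrow[V \in \mathcal{V}]{} 0 \qquad (\forall \zeta, \eta \in H) \]

since $\rho$ is s.o.-continuous. This implies that $\tilde\rho$ is non-degenerate. The same
reasoning shows that $\lim_{V \in \mathcal{V}} \tilde\rho((f_V)_{t^{-1}}) = \rho(t)$ in the w.o.t. for each $t \in G$,
so that the unitary representation $\rho$ of $G$ can be reconstructed from the
representation $\tilde\rho$ of $C_c(G)$ it induces. We proved the following.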


Proposition 13.23. For a unitary representation $\rho$ of $G$ on a Hilbert space
$H$, the function $\tilde\rho : C_c(G) \to B(H)$ is a non-degenerate representation of the
*-algebra $C_c(G)$ on $H$ satisfying $\|\tilde\rho(f)\| \le \|f\|_{L^1(G)}$ for all $f \in C_c(G)$.
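
The properties collected in Proposition 13.23 can be observed numerically in the simplest possible setting. The sketch below assumes $G = \mathbb{Z}_n$ with counting measure and a one-dimensional unitary representation (a character $t \mapsto e^{2\pi i k t / n}$); the function names are illustrative, and `integrated` stands for the map written $\tilde\rho$ above, which here is a finite sum.

```python
import cmath

def convolve(f, g):
    """(f * g)(t) = sum_s f(s) g(t - s) on Z_n."""
    n = len(f)
    return [sum(f[s] * g[(t - s) % n] for s in range(n)) for t in range(n)]

def involution(f):
    """f*(t) = conj(f(-t)) on Z_n (unimodular)."""
    n = len(f)
    return [complex(f[(-t) % n]).conjugate() for t in range(n)]

def character(k, n):
    """One-dimensional unitary representation t -> exp(2 pi i k t / n) of Z_n."""
    return lambda t: cmath.exp(2j * cmath.pi * k * t / n)

def integrated(rho, f):
    """Integrated form: sum_t f(t) rho(t), the counting-measure integral."""
    return sum(f[t] * rho(t) for t in range(len(f)))

n = 6
f = [1 + 2j, 0.5, -1j, 0, 3, -2 + 1j]
g = [0, 1, 1 - 1j, 4, 1j, -1]
l1_norm = sum(abs(z) for z in f)
for k in range(n):
    rho = character(k, n)
    # Multiplicativity: convolution is sent to the product.
    assert abs(integrated(rho, convolve(f, g)) - integrated(rho, f) * integrated(rho, g)) < 1e-9
    # *-preservation: the involution is sent to the adjoint (complex conjugate).
    assert abs(integrated(rho, involution(f)) - integrated(rho, f).conjugate()) < 1e-9
    # Contractivity with respect to the L^1-norm.
    assert abs(integrated(rho, f)) <= l1_norm + 1e-12
```

For a character the integrated form is a value of the finite Fourier transform of $f$, so the three assertions are the familiar convolution, conjugation and boundedness properties of that transform.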


Conversely, suppose that $\varrho : C_c(G) \to B(H)$ is a non-degenerate
representation of the *-algebra $C_c(G)$ on a Hilbert space $H$ satisfying $\|\varrho(f)\| \le
\|f\|_{L^1(G)}$ for all $f \in C_c(G)$. We will demonstrate that it is induced as shown by
a unitary representation of $G$.

Pick an approximate identity $\{f_V\}_{V \in \mathcal{V}}$ for $C_c(G)$. Let $t \in G$. We claim that
the net $\{\varrho((f_V)_t)\}_{V \in \mathcal{V}}$ converges in the s.o.t. of $B(H)$. To prove the claim, note
that $\|\varrho((f_V)_t)\| \le \|(f_V)_t\|_{L^1(G)} = \|f_V\|_{L^1(G)} = 1$ for each $V$, thus the net is
bounded. Also, for each $g \in C_c(G)$ we have

\[ \varrho((f_V)_t) \, \varrho(g) = \varrho((f_V)_t * g) = \varrho((f_V * g)_t) \xrightarrow[V \in \mathcal{V}]{} \varrho(g_t) \]

(in the operator norm!) because $f_V * g \to g$, thus $(f_V * g)_t \to g_t$, in $L^1(G)$.
In particular, $\varrho((f_V)_t) \, \varrho(g) \to \varrho(g_t)$ in the s.o.t. Since $\varrho$ is non-degenerate,
$H$ is a Hilbert space, and $\{\varrho((f_V)_t)\}_{V \in \mathcal{V}}$ is bounded, we infer that
$\{\varrho((f_V)_t)\}_{V \in \mathcal{V}}$ converges in the s.o.t. (and the limit does not depend on the
choice of $\{f_V\}_{V \in \mathcal{V}}$).

For $t \in G$ write $\tilde\varrho(t) \in B(H)$ for the s.o.-limit of $\{\varrho((f_V)_{t^{-1}})\}_{V \in \mathcal{V}}$. It is a
technical exercise, which we skip, to show that $t \mapsto \tilde\varrho(t)$ is a unitary representation
of $G$ on $H$ such that $(\tilde\varrho)^{\sim} = \varrho$. To summarize:


Proposition 13.24. There is a one-to-one correspondence between the
unitary representations $\rho$ of $G$ on a Hilbert space $H$ and the non-degenerate
representations $\varrho$ of the *-algebra $C_c(G)$ on $H$ satisfying $\|\varrho(f)\| \le \|f\|_{L^1(G)}$ for
all $f \in C_c(G)$. It is given by the maps $\rho \mapsto \tilde\rho$ and $\varrho \mapsto \tilde\varrho$, which are inverse to
one another.


13.2.2 The definition and representations of the group 
C*-algebra 


We assume familiarity with C*-norms and C*-completions of *-algebras (see 
Section 13.1.2). 
For each $f \in C_c(G)$ let

\[ \|f\| := \sup \{ \|\tilde\rho(f)\|;\ \rho \text{ is a unitary representation of } G \}. \]

By Proposition 13.23, $\|f\| \le \|f\|_{L^1(G)}$, so in particular the sup is finite. By
Proposition 13.24, $\|f\|$ equals the sup of all numbers $\|\varrho(f)\|$ where $\varrho$ goes over
the non-degenerate representations of the *-algebra $C_c(G)$ satisfying $\|\varrho(g)\| \le
\|g\|_{L^1(G)}$ for each $g \in C_c(G)$.

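It is immediate that $\|\cdot\|$ is a C*-semi-norm on $C_c(G)$. To prove that it is a
C*-norm, we show that if $0 \ne f \in C_c(G)$ then $\tilde\lambda(f) \ne 0$, where $\lambda$ is the left
regular representation of $G$ on $L^2(G)$ (Example 13.22), thus $\|f\| \ge \|\tilde\lambda(f)\| > 0$.
Indeed, for $g, h \in C_c(G) \subset L^2(G)$ we have by Fubini's theorem

\[ \langle \tilde\lambda(f) g, h \rangle = \int_G f(t) \langle \lambda(t) g, h \rangle \, d\mu(t) = \int_G f(t) \langle g_{t^{-1}}, h \rangle \, d\mu(t)
= \int_G f(t) \Bigl( \int_G g(t^{-1}s) \overline{h(s)} \, d\mu(s) \Bigr) d\mu(t)
= \int_G \Bigl( \int_G f(t) g(t^{-1}s) \, d\mu(t) \Bigr) \overline{h(s)} \, d\mu(s) = \langle f * g, h \rangle_{L^2(G)}, \]

proving that $\tilde\lambda(f) g = f * g$ by the density of $C_c(G)$ in $L^2(G)$. Defining $g \in C_c(G)$
by $g(t) := \overline{f(t^{-1})}$, $t \in G$, we have $(\tilde\lambda(f) g)(e) = (f * g)(e) = \|f\|_{L^2(G)}^2 > 0$. This
proves that $\tilde\lambda(f) \ne 0$.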


Definition 13.25. The C*-completion of C.(G) with respect to the C*-norm 
||-|| above is called the (universal or full) group C*-algebra of G. It is denoted by 
C*(G). 


If $\rho$ is a unitary representation of $G$ on a Hilbert space $H$, then the inequality
$\|\tilde\rho(\cdot)\| \le \|\cdot\|$ on $C_c(G)$ proves that $\tilde\rho : C_c(G) \to B(H)$ extends uniquely to a
bounded linear map from $C^*(G)$ to $B(H)$, also denoted by $\tilde\rho$, which is clearly
a non-degenerate representation of $C^*(G)$. Conversely, given a non-degenerate
representation $\Phi$ of $C^*(G)$ on a Hilbert space $H$, the restriction $\Phi|_{C_c(G)}$ is a non-degenerate
representation of $C_c(G)$ and $\|\Phi(f)\| \le \|f\| \le \|f\|_{L^1(G)}$ for all $f \in
C_c(G)$, so by Proposition 13.24, $\Phi|_{C_c(G)}$ equals $\tilde\rho$ for some unitary representation
$\rho$ of $G$ on $H$.

So, continuing Proposition 13.24, there is a one-to-one correspondence 
between the unitary representations of G and the non-degenerate representations 
of the C*-algebra C*(G). We can loosely say that C*(G) is “universal with 
respect to the unitary representations of G”. 


13.2.3. Properties of the group C*-algebra 


We list several properties of group C*-algebras, which are important in quantum 
group theory. 


Theorem 13.26.
(i) $C^*(G)$ is commutative iff $G$ is abelian.
(ii) $C^*(G)$ is unital iff $G$ is discrete.
(iii) $C^*(G)$ admits a character, that is, a non-zero homomorphism from $C^*(G)$
to $\mathbb{C}$.
(iv) Assume that $G$ is discrete. There exists a unique bounded linear map
$\Delta : C^*(G) \to C^*(G) \otimes_{\max} C^*(G)$ that, using the identification of $C_c(G)$
with $\mathbb{C}[G]$ as in Example 13.19, maps $t \in G$ to $t \otimes t$. It is an isometric
*-homomorphism.


Proof. (i) The C*-algebra $C^*(G)$ is commutative iff the *-algebra $C_c(G)$ is
commutative. If $G$ is abelian, then in particular it is unimodular, and for all
$f, g \in C_c(G)$ we have

\[ (f * g)(t) = \int_G f(s) g(s^{-1}t) \, d\mu(s) = \int_G f(ts^{-1}) g(s) \, d\mu(s)
= \int_G g(s) f(s^{-1}t) \, d\mu(s) = (g * f)(t) \qquad (\forall t \in G), \]

proving that $f * g = g * f$. Thus $C_c(G)$ is commutative. The converse is left as
an exercise.

(ii) If G is discrete, then identifying C.(G) with C[G], the element e € C[G] 
is evidently a unit for C[G], thus for C*(G). For the converse, see Exercise 21. 

(iii) Denote by 1 the trivial representation of G (Example 13.21). Then 1 is a 
non-degenerate (in particular, non-zero) representation of C*(G) on C, that is, 
a character of C*(G). 

(iv) Uniqueness follows from the density of C[G] in C*(G). 

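To prove the existence of the bounded linear map $\Delta$ we should show that for
each $\sum_{t \in G} \alpha_t t$ in $\mathbb{C}[G]$ (identified with $C_c(G)$) we have

\[ \Bigl\| \sum_{t \in G} \alpha_t (t \otimes t) \Bigr\|_{C^*(G) \otimes_{\max} C^*(G)} \le \Bigl\| \sum_{t \in G} \alpha_t t \Bigr\|_{C^*(G)}. \qquad (2) \]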

Let $\pi$ be a non-degenerate representation of $C^*(G) \otimes_{\mathrm{alg}} C^*(G)$ on a Hilbert
space $H$. Let $\pi_1, \pi_2$ be the (unique) non-degenerate representations of $C^*(G)$
on $H$ with commuting ranges such that $\pi = \pi_1 \cdot \pi_2$ (Proposition 13.4), and
for $i = 1, 2$ let $\rho_i$ be the (unique) unitary representation of $G$ on $H$ such that
$\pi_i = \tilde\rho_i$ (see the paragraph after Definition 13.25). It follows from the proof of
Proposition 13.24 that the ranges of $\rho_1, \rho_2$ commute. Consequently, the function
$\rho_1 \cdot \rho_2 : G \to U(H)$ given by $(\rho_1 \cdot \rho_2)(t) := \rho_1(t) \rho_2(t)$, $t \in G$, is a unitary
representation of $G$ on $H$.

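For each $\sum_{t \in G} \alpha_t t \in \mathbb{C}[G]$ we have

\[ \pi \Bigl( \sum_{t \in G} \alpha_t (t \otimes t) \Bigr) = \sum_{t \in G} \alpha_t \pi_1(t) \pi_2(t) = \sum_{t \in G} \alpha_t \tilde\rho_1(t) \tilde\rho_2(t) = \sum_{t \in G} \alpha_t \rho_1(t) \rho_2(t)
= \sum_{t \in G} \alpha_t (\rho_1 \cdot \rho_2)(t) = (\rho_1 \cdot \rho_2)^{\sim} \Bigl( \sum_{t \in G} \alpha_t t \Bigr). \]

Therefore, by the definition of the norm on $C^*(G)$,

\[ \Bigl\| \pi \Bigl( \sum_{t \in G} \alpha_t (t \otimes t) \Bigr) \Bigr\| = \Bigl\| (\rho_1 \cdot \rho_2)^{\sim} \Bigl( \sum_{t \in G} \alpha_t t \Bigr) \Bigr\| \le \Bigl\| \sum_{t \in G} \alpha_t t \Bigr\|_{C^*(G)}. \]

This holds true for each non-degenerate representation $\pi$ of $C^*(G) \otimes_{\mathrm{alg}} C^*(G)$,
thus verifying (2) by the definition of the maximal C*-norm (Theorem 13.8).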

The bounded linear map $\Delta$ is easily seen to be a *-homomorphism. It remains
to show that $\Delta$ is injective (thus isometric). The character $1 \in C^*(G)^*$ of $C^*(G)$
discussed in Part (iii) satisfies $1(t) = 1$ for all $t \in G$ (we still identify $\mathbb{C}[G]$
with $C_c(G)$ and embed it in $C^*(G)$). Consider the slice map $1 \otimes \mathrm{id}_{C^*(G)} \in
B(C^*(G) \otimes_{\max} C^*(G), C^*(G))$, see Exercise 5. For all $t \in G$ we have

\[ \bigl( (1 \otimes \mathrm{id}_{C^*(G)}) \circ \Delta \bigr)(t) = (1 \otimes \mathrm{id}_{C^*(G)})(t \otimes t) = 1(t) \, t = t. \]

By the density of $\mathbb{C}[G]$ in $C^*(G)$ and by the linearity and boundedness of the
operators in both sides we deduce that $(1 \otimes \mathrm{id}_{C^*(G)}) \circ \Delta = \mathrm{id}_{C^*(G)}$. In particular,
$\Delta$ is injective.


Remark 13.27. Since C*(G) is, in general, not commutative, the existence 
of a character on C*(G) is not clear. There are “many” C*-algebras without 
characters! 

Remark 13.28. In the proof of Part (iv) the discreteness of $G$ was required for
the definition of $\Delta$ (by the formula $\sum_{t \in G} \alpha_t t \mapsto \sum_{t \in G} \alpha_t (t \otimes t)$). However, notice
that the paragraph starting with “Let $\pi$ be a non-degenerate representation of
$C^*(G) \otimes_{\mathrm{alg}} C^*(G)$” remains valid when $G$ is not discrete. And indeed, there is an
extension of Part (iv) to general locally compact topological groups $G$ with the
co-domain being a canonical C*-algebra strictly larger than $C^*(G) \otimes_{\max} C^*(G)$.


Exercises 


Tensor products of C*-algebras 
Unless otherwise stated, A,B are C*-algebras. 


1. Let $H, K$ be Hilbert spaces. Recall that for $x, y \in H$ we write $\theta_{x,y}$ (here
$\theta^H_{x,y}$, for clarity) for the element $\langle \cdot, x \rangle y$ of $B(H)$.

(a) Let $x, y \in H$ and $w, z \in K$. Prove that $\theta^H_{x,y} \otimes \theta^K_{w,z} = \theta^{H \otimes K}_{x \otimes w, y \otimes z}$
(where the tensor in the left-hand side is given by Part (vi) of
Proposition 8.18).

(b) Let $A \in B(H \otimes K)$. Assume that $\mathrm{Tr}((\theta^H_{x,y} \otimes \theta^K_{w,z}) A) = 0$ for all $x, y \in H$
and $w, z \in K$. Prove that $A = 0$. (See Section 12.6 for the Tr notation.)

(c) Use Theorem 12.25 and separation to deduce that $B(H) \otimes_{\mathrm{alg}} B(K) \subset
B(H \otimes K)$ is ultraweakly dense in $B(H \otimes K)$ (hence also w.o.-dense
and s.o.-dense by Exercise 15 in Chapter 12).


2. (a) Let $C$ be a *-algebra. Assume that $\|\cdot\|_1$ is a norm on $C$ making it a
C*-algebra and $\|\cdot\|_2$ is a C*-norm on $C$. Prove that $\|\cdot\|_1 = \|\cdot\|_2$.

(b) Prove that for all $n \in \mathbb{N}$, $\mathbb{C}^n$ and $M_n(\mathbb{C})$ are nuclear C*-algebras.


3. The minimal and the maximal tensor products of C*-algebras are
commutative, associative, and distributive. Provide precise statements of
this and prove them.


4. (Tensor products of homomorphisms.) Let $A_1, A_2, B_1, B_2$ be C*-algebras.
Suppose that $\varphi : A_1 \to A_2$ and $\psi : B_1 \to B_2$ are *-homomorphisms.

(a) Prove that the *-homomorphism $\varphi \otimes \psi : A_1 \otimes_{\mathrm{alg}} B_1 \to A_2 \otimes_{\mathrm{alg}} B_2$
extends uniquely both to a bounded linear map $A_1 \otimes_{\min} B_1 \to A_2 \otimes_{\min}
B_2$ and to a bounded linear map $A_1 \otimes_{\max} B_1 \to A_2 \otimes_{\max} B_2$, also
denoted by $\varphi \otimes \psi$, that these maps are *-homomorphisms, and that
they are surjective if $\varphi, \psi$ are surjective.

(b) Prove that if $\varphi, \psi$ are injective, then so is the map $\varphi \otimes \psi : A_1 \otimes_{\min} B_1 \to
A_2 \otimes_{\min} B_2$. (This is not true for the maximal tensor product.)


5. (Slice maps.) Let $\omega \in A^*$. Denote by $\mathrm{id}_B$ the identity map on $B$. Prove
that the linear map $\omega \otimes \mathrm{id}_B : A \otimes_{\mathrm{alg}} B \to B$ (sending $a \otimes b$ to $\omega(a) b$) extends
uniquely both to a bounded linear map $A \otimes_{\min} B \to B$ and to a bounded
linear map $A \otimes_{\max} B \to B$, also denoted by $\omega \otimes \mathrm{id}_B$. Prove that these maps
have norm $\|\omega\|$ and that they are positive (i.e., send positive elements to
positive elements) if so is $\omega$.

(Hint: it suffices to prove the assertion for the minimal tensor product.
In this case there are several similar ways to show existence, either by
assuming first that $\omega$ is positive and using the GNS construction, or by
adapting the proof of Proposition 13.13 using Exercise 16.)


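6. (Tensor products by commutative C*-algebras revisited.)
As in Section 13.1.6, let $\Omega$ be a locally compact Hausdorff space and $A$
be a C*-algebra. In this exercise we show explicitly that $C_0(\Omega) \otimes_{\min} A$
is *-isomorphic to $C_0(\Omega, A)$ canonically.

Let $\rho$ be the natural faithful representation of $C_0(\Omega)$ on $\ell^2(\Omega) = \sum_{x \in \Omega} \oplus \, \mathbb{C}$
given by “multiplication”: for $f \in C_0(\Omega)$, $\rho(f)$ takes $\sum_{x \in \Omega} \oplus \, \lambda_x \in \ell^2(\Omega)$
to $\sum_{x \in \Omega} \oplus \, f(x) \lambda_x \in \ell^2(\Omega)$. Let $\sigma$ be a faithful representation of $A$ on a
Hilbert space $H$. Consider the faithful representation $\pi$ of $C_0(\Omega, A)$ on
$\sum_{x \in \Omega} \oplus \, H$ given by $C_0(\Omega, A) \ni F \mapsto \sum_{x \in \Omega} \oplus \, \sigma(F(x)) \in B(\sum_{x \in \Omega} \oplus \, H)$.
Identifying $\ell^2(\Omega) \otimes H$ with $\sum_{x \in \Omega} \oplus \, H$ canonically, prove that $(\rho \otimes \sigma)
(f \otimes a) = \pi(fa)$ for all $f \in C_0(\Omega)$ and $a \in A$. Deduce that the images of
the faithful representations $\rho \otimes \sigma$ and $\pi$ of $C_0(\Omega) \otimes_{\min} A$ and $C_0(\Omega, A)$,
respectively, on $\ell^2(\Omega) \otimes H$, are equal.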


7. (Compare Exercise 28 in Chapter 8 as well as Exercise 17 here.) Let
$\Omega_1, \Omega_2$ be locally compact Hausdorff spaces. In this exercise we prove
that $C_0(\Omega_1) \otimes C_0(\Omega_2)$ “is” $C_0(\Omega_1 \times \Omega_2)$, where $C_0(\Omega_1) \otimes C_0(\Omega_2)$ is the
unique C*-algebraic tensor product of $C_0(\Omega_1)$ and $C_0(\Omega_2)$; see Takesaki's
theorem.

Precisely: for $f \in C_0(\Omega_1)$ and $g \in C_0(\Omega_2)$, define $f \times g : \Omega_1 \times \Omega_2 \to \mathbb{C}$ by
$[x, y] \mapsto f(x) g(y)$ ($x \in \Omega_1$, $y \in \Omega_2$), and notice that $f \times g \in C_0(\Omega_1 \times \Omega_2)$.
Prove that there exists a unique bounded linear map from $C_0(\Omega_1) \otimes C_0(\Omega_2)$
to $C_0(\Omega_1 \times \Omega_2)$ that sends $f \otimes g$ to $f \times g$ for all $f \in C_0(\Omega_1)$ and $g \in C_0(\Omega_2)$,
and that this map is a surjective *-isomorphism.


8. (Compare Proposition 13.4 and Corollary 13.10.) Let $C$ be a C*-algebra
and let $\pi_A : A \to C$ and $\pi_B : B \to C$ be homomorphisms with commuting
ranges. Prove that there exists a unique bounded linear map from $A \otimes_{\max} B$
to $C$ mapping $a \otimes b$ to $\pi_A(a) \pi_B(b)$, and that this map is a *-homomorphism.


Tensor products of von Neumann algebras 


Unless otherwise stated, $R, S$ are von Neumann algebras acting on Hilbert
spaces $H, K$, respectively. We define the (normal spatial) von Neumann
algebraic tensor product of $R$ and $S$, denoted by $R \bar\otimes S$, to be the w.o.-
(equivalently, ultraweak or s.o.-) closure of $R \otimes_{\mathrm{alg}} S$ in $B(H \otimes K)$; for this
we view $R \otimes_{\mathrm{alg}} S$ as a *-subalgebra of $B(H) \otimes_{\mathrm{alg}} B(K)$, which in turn is
viewed as a *-subalgebra of $B(H \otimes K)$. Then $R \bar\otimes S$ is a von Neumann
algebra on $H \otimes K$. Notice that $R \bar\otimes S$ is the w.o.-closure of $R \otimes_{\min} S$, when
the latter is viewed as sitting in $B(H \otimes K)$ using the given representations
of $R$ and $S$ on $H$ and $K$, respectively.


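9. (a) Observe that $B(H) \bar\otimes B(K) = B(H \otimes K)$ trivially by Exercise 1.

(b) (Matrix representation of elements of $B(H) \bar\otimes B(K) = B(H \otimes K)$.)
Let $\{f_j\}_{j \in J}$ be an orthonormal basis of $K$. To every $T \in B(H \otimes K)$
there is an associated matrix $(T_{ij})_{i,j \in J}$ with entries in $B(H)$ obtained
by the identification of $H \otimes K$ with $\sum_{j \in J} \oplus \, H$ (see Part (iv) of
Proposition 8.18), that is, $T_{ij}\zeta = (I_H \otimes \langle \cdot, f_i \rangle) T(\zeta \otimes f_j)$ for $i, j \in J$
and $\zeta \in H$, where the map $I_H \otimes \langle \cdot, f_i \rangle : H \otimes K \to H$ is given
by Part (vi) of Proposition 8.18. Prove that $\|T_{ij}\| \le \|T\|$ for all
$i, j \in J$ and that $T\bigl(\sum_{j \in J} \zeta_j \otimes f_j\bigr) = \sum_{i \in J} \bigl(\sum_{j \in J} T_{ij}\zeta_j\bigr) \otimes f_i$ for
all $\sum_{j \in J} \oplus \, \zeta_j \in \sum_{j \in J} \oplus \, H$. Prove that the map $T \mapsto (T_{ij})_{i,j \in J}$ is an
injective homomorphism from $B(H \otimes K)$ to the *-algebra $M_J(B(H))$.

(c) Prove that $T \in R \bar\otimes B(K)$ iff $T_{ij} \in R$ for all $i, j \in J$, and that
$T \in R \bar\otimes \mathbb{C}1_K$ iff there is $S \in R$ such that $T_{ij} = \delta_{ij} S$ for all $i, j \in J$
(where $\delta$ is Kronecker's delta).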


10. For $n \in \mathbb{N}$, describe $R \bar\otimes \mathbb{C}^n$ and $R \bar\otimes M_n(\mathbb{C})$.

11. Prove that the commutants of $R \bar\otimes \mathbb{C}1_K$ and $R \bar\otimes B(K)$ inside $B(H \otimes K)$
are $R' \bar\otimes B(K)$ and $R' \bar\otimes \mathbb{C}1_K$, respectively.

12. The von Neumann algebraic tensor product is commutative, associative,
and distributive. Provide precise statements of this and prove them.

13. (Tensor product of normal functionals.) Let $\omega \in R_*$ and $\tau \in S_*$. Use
Lemma 13.11 to prove that there exists a unique linear functional $\omega \otimes \tau \in
(R \bar\otimes S)_*$ satisfying $(\omega \otimes \tau)(A \otimes B) = \omega(A) \tau(B)$ for all $A \in R$ and $B \in S$.
Also prove that $\|\omega \otimes \tau\| = \|\omega\| \|\tau\|$ and that $\omega \otimes \tau$ is positive if so are $\omega, \tau$.


14. Prove that $\mathrm{span}\{\omega \otimes \tau;\ \omega \in R_*, \tau \in S_*\}$ is norm-dense in $(R \bar\otimes S)_*$.

15. (Tensor products of normal *-homomorphisms; compare Exercise 4.)
Recall Exercise 13 in Chapter 12. Let $R_1, R_2, S_1, S_2$ be von Neumann
algebras. Suppose that $\varphi : R_1 \to R_2$ and $\psi : S_1 \to S_2$ are ultraweakly
continuous *-homomorphisms. Prove that the *-homomorphism $\varphi \otimes \psi :
R_1 \otimes_{\mathrm{alg}} S_1 \to R_2 \otimes_{\mathrm{alg}} S_2$ extends uniquely to an ultraweakly continuous
linear map $R_1 \bar\otimes S_1 \to R_2 \bar\otimes S_2$, also denoted by $\varphi \otimes \psi$, that this map is
a *-homomorphism, and that it is injective/surjective if so are $\varphi, \psi$.

(Hint: for existence: let $X \in R_1 \bar\otimes S_1$. Take a bounded net $\{X_\alpha\}_\alpha$ in
$R_1 \otimes_{\mathrm{alg}} S_1$ that converges ultraweakly to $X$ in $R_1 \bar\otimes S_1$ (why does such a
net exist?), and for $\omega \in (R_2)_*$ and $\tau \in (S_2)_*$, prove that the scalar net
$\{(\omega \otimes \tau)((\varphi \otimes \psi)(X_\alpha))\}_\alpha$ converges. Now apply Exercises 4 and 14.)

16. (Slice maps; compare Exercise 5.) Fix $\omega \in R_*$. Prove that the linear map
$\omega \otimes \mathrm{id}_S : R \otimes_{\mathrm{alg}} S \to S$ (sending $A \otimes B$ to $\omega(A) B$) extends uniquely to an
ultraweakly continuous linear map $R \bar\otimes S \to S$, also denoted by $\omega \otimes \mathrm{id}_S$.
Observe that $\tau \circ (\omega \otimes \mathrm{id}_S) = \omega \otimes \tau$, and prove that $\|\omega \otimes \mathrm{id}_S\| = \|\omega\|$ and
that $\omega \otimes \mathrm{id}_S$ is positive (i.e., sends positive elements to positive elements)
if so is $\omega$.

(Hint: for existence: consider the Banach adjoint of the linear map from
$S_*$ to $(R \bar\otimes S)_*$ given by $\tau \mapsto \omega \otimes \tau$.)


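17. (Compare Exercise 28 in Chapter 8 as well as Exercise 7 here.)
Let $(\Omega, \mathcal{A}, \mu)$ and $(\Theta, \mathcal{B}, \nu)$ be complete $\sigma$-finite positive measure
spaces. In this exercise we prove that, essentially, $L^\infty(\Omega, \mathcal{A}, \mu) \bar\otimes
L^\infty(\Theta, \mathcal{B}, \nu)$ “is” $L^\infty(\Omega \times \Theta, \mathcal{A} \times \mathcal{B}, \mu \times \nu)$. Precisely: use Exercise 28 in
Chapter 8 to prove that $\{M_f;\ f \in L^\infty(\Omega, \mathcal{A}, \mu)\} \bar\otimes \{M_g;\ g \in L^\infty(\Theta, \mathcal{B}, \nu)\}$
(acting on $L^2(\Omega, \mathcal{A}, \mu) \otimes L^2(\Theta, \mathcal{B}, \nu)$) is spatially isomorphic to
$\{M_F;\ F \in L^\infty(\Omega \times \Theta, \mathcal{A} \times \mathcal{B}, \mu \times \nu)\}$ (acting on $L^2(\Omega \times \Theta, \mathcal{A} \times \mathcal{B}, \mu \times \nu)$).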


Group C*-algebras 


18. (a) Show that in the definition of a unitary representation one can replace
the s.o.t. by the w.o.t., yielding an equivalent definition.

(b) Prove that the left regular representation is continuous in the operator
norm topology iff the group is discrete.

19. This exercise explores $C^*(\mathbb{R})$, where the additive group $\mathbb{R}$ is considered
with its usual topology. Recall from Exercise 7 in Chapter 2 or from
Section 11.3.2 that the Fourier transform of $f \in L^1(\mathbb{R})$ is the function
f © Co(R) given by f(t) := Sa f (ue da (¢ € R). For every f € 
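$L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ we have $\hat f \in L^2(\mathbb{R})$ and $\|\hat f\|_{L^2(\mathbb{R})} = \|f\|_{L^2(\mathbb{R})}$. Thus, the
function from $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ to $L^2(\mathbb{R})$ given by $f \mapsto \hat f$ extends (uniquely)
to a unitary operator $F \in B(L^2(\mathbb{R}))$.

(a) Prove that for each $f \in C_c(\mathbb{R})$ we have $\tilde\lambda(f) = F^* M_{\hat f} F$, where $\lambda$
is the left regular representation of $\mathbb{R}$ and $M_{\hat f} \in B(L^2(\mathbb{R}))$ is the
multiplication operator on $L^2(\mathbb{R})$ given by $g \mapsto \hat f g$. Deduce that
$\|\tilde\lambda(f)\| = \|\hat f\|_u = \sup_{\mathbb{R}} |\hat f|$.

(b) Prove that for every unitary representation $\rho$ of $\mathbb{R}$ and every
$f \in C_c(\mathbb{R})$, we have $\|\tilde\rho(f)\| \le \|\hat f\|_u$.
(Hint: Stone's theorem (Exercise 9 in Chapter 10).)

(c) Deduce that the norm in $C^*(\mathbb{R})$ of each $f \in C_c(\mathbb{R})$ is equal to $\|\hat f\|_u$.

(d) Prove that there is a natural (isometric) isomorphism between $C^*(\mathbb{R})$
and $C_0(\mathbb{R})$.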


20. Prove, similarly to Exercise 19, that there are natural isomorphisms
between $C^*(\mathbb{Z})$ and $C(\mathbb{T})$ as well as between $C^*(\mathbb{T})$ and $C_0(\mathbb{Z})$. This
requires introducing the Fourier transform on the appropriate groups.


21. Let $G$ be a locally compact topological group and $\lambda$ be its left regular
representation.

(a) The (generally non-unital) C*-subalgebra of $B(L^2(G))$ generated by
$\{\tilde\lambda(f);\ f \in C_c(G)\}$ is called the reduced group C*-algebra of $G$ and is
denoted by $C^*_r(G)$. Explain why the map $C_c(G) \ni f \mapsto \tilde\lambda(f)$ extends
(uniquely) to a *-homomorphism from $C^*(G)$ onto $C^*_r(G)$.

(b) Let $\{f_V\}_{V \in \mathcal{V}}$ be an approximate identity for $C_c(G)$ (see Section 13.2).
For $V \in \mathcal{V}$, notice that $\|f_V^{1/2}\|_{L^2(G)} = 1$, and consider the restriction
$\omega_V$ of the vector state $\omega_{f_V^{1/2}}$ of $B(L^2(G))$ to $C^*_r(G)$. Explain why the
net $\{\omega_V\}_{V \in \mathcal{V}}$ consists of states of $C^*_r(G)$.

(c) Prove that if $G$ is not discrete then $\omega_V \to 0$ along $V \in \mathcal{V}$ in the weak* topology
of $C^*_r(G)^*$. (Hint: $\mu(V) \to 0$ along $V \in \mathcal{V}$ when $G$ is not discrete.)

(d) Prove that if $C^*_r(G)$ is unital then $G$ is discrete.

(e) Conclude that $C^*(G)$ is unital iff $C^*_r(G)$ is unital iff $G$ is discrete.


22. Let $G_1, G_2$ be locally compact topological groups. Prove that there exists a
canonical *-isomorphism from $C^*(G_1 \times G_2)$ onto $C^*(G_1) \otimes_{\max} C^*(G_2)$. For
technical reasons you may assume that $G_1, G_2$ are $\sigma$-compact. The case
that $G_1, G_2$ are discrete is technically easier, and it suffices for conveying
the general case's idea (cf. Example 13.2).


Application I 


Probability 


I.1 Heuristics 


A fundamental concept in probability theory is that of an event. The “real world” 
content of the “event” plays no role in the mathematical analysis. What matters 
is only the event’s occurrence or non-occurrence. 

Two “extreme” events are the empty event $\emptyset$ (which cannot occur), and the
sure event $\Omega$ (which occurs always).

To each event $A$, one associates the complementary event $A^c$, which occurs
iff $A$ does not occur.

If the occurrence of the event $A$ forces the occurrence of the event $B$, one
says that $A$ implies $B$, and one writes $A \subset B$. One has trivially $\emptyset \subset A \subset \Omega$ for
any event $A$.

The events $A, B$ are equivalent (notation: $A = B$) if they imply each other.
Such events are identified.

The intersection $A \cap B$ of the events $A$ and $B$ occurs iff $A$ and $B$ both occur.
If $A \cap B = \emptyset$ (i.e., if $A$ and $B$ cannot occur together), one says that the events
are mutually disjoint; for example, for any event $A$, the events $A$ and $A^c$ are
mutually disjoint.

The union $A \cup B$ of the events $A, B$ is the event that occurs iff at least one of
the events $A, B$ occurs. The operations $\cap$ and $\cup$ are trivially commutative, and
satisfy the following relations:

\[ A \cup A^c = \Omega; \qquad A \cap B \subset A \subset A \cup B. \]


One verifies that the algebra of events satisfies the usual associative and
distributive laws for the family $\mathcal{P}(\Omega)$ of all subsets of a set $\Omega$, with standard
operations between subsets, as well as the DeMorgan (dual) laws:

\[ \Bigl( \bigcup_k A_k \Bigr)^c = \bigcap_k A_k^c; \qquad \Bigl( \bigcap_k A_k \Bigr)^c = \bigcup_k A_k^c \]

for any sequence of events $\{A_k\}$. Mathematically, we may then view the sure
event $\Omega$ as a given set (called the sample space), and the set of all events (for a
particular probability problem) as an algebra of subsets of $\Omega$.

Since limiting processes are central in probability theory, countable unions of
events should also be events. Therefore, in the set-theoretical model, the algebra
of events is required to be a $\sigma$-algebra $\mathcal{A}$.

The second fundamental concept of probability theory is that of a probability.
Each event $A \in \mathcal{A}$ is assigned a probability $P(A)$ (also denoted $PA$), such that:

(1) $0 \le P(A) \le 1$ for all $A \in \mathcal{A}$;
(2) $P(\bigcup_k A_k) = \sum_k P(A_k)$ for any sequence of mutually disjoint events $A_k$; and
(3) $P(\Omega) = 1$.

In other words, $P$ is a “normalized” finite positive measure on the measurable
space $(\Omega, \mathcal{A})$. The measure space $(\Omega, \mathcal{A}, P)$ is called a probability space. Note that
$P(A^c) = 1 - P(A)$.


Examples.

(1) The trivial probability space $(\Omega, \mathcal{A}, P)$ has $\Omega$ arbitrary, $\mathcal{A} = \{\emptyset, \Omega\}$, $P(\emptyset) =
0$, and $P(\Omega) = 1$.

(2) Discrete probability space. $\Omega$ is the union of finitely many mutually disjoint
events $A_1, \dots, A_n$, with probabilities $P(A_k) = p_k$, $p_k > 0$, and $\sum_k p_k = 1$.
The family $\mathcal{A}$ consists of $\emptyset$ and all finite unions $A = \bigcup_{k \in J} A_k$, where
$J \subset \{1, \dots, n\}$. One lets $P(\emptyset) = 0$ and $P(A) = \sum_{k \in J} p_k$.

This probability space is the (finite) discrete probability space. When
$p_k = p$ for all $k$ (so that $p = 1/n$), one gets the (finite) uniform probability
space. The formula for the probability reduces in this special case to

\[ P(A) = \frac{|A|}{n}, \]

where $|A|$ denotes the number of points in the index set $J$ (i.e., the number
of “elementary events” $A_k$ contained in $A$).


(3) Random sampling. A sample of size $s$ from a population $\mathcal{P}$ of $N \ge s$
objects is a subset $S \subset \mathcal{P}$ with $s$ elements ($|S| = s$). The sampling is
random if all $\binom{N}{s}$ samples of size $s$ are assigned the same probability (i.e.,
the corresponding probability space is a uniform probability space, where
$\Omega$ is the set of all samples of given size $s$; this is actually the origin of the
name “sample space” given to $\Omega$). The elementary event of getting any
particular sample of size $s$ has probability $1 / \binom{N}{s}$.

Suppose the population $\mathcal{P}$ is the disjoint union of $m$ sub-populations
(“layers”) $\mathcal{P}_i$ of size $N_i$ ($\sum_i N_i = N$). The number of size $s$ samples with
$s_i$ objects from $\mathcal{P}_i$ ($i = 1, \dots, m$; $\sum_i s_i = s$) is the product of the binomial
coefficients $\binom{N_i}{s_i}$. Therefore, if $A_{s_1, \dots, s_m}$ denotes the event of getting $s_i$ objects
from $\mathcal{P}_i$ ($i = 1, \dots, m$) in a random sampling of $s$ objects from the multi-layered
population $\mathcal{P}$, then

\[ P(A_{s_1, \dots, s_m}) = \frac{\binom{N_1}{s_1} \cdots \binom{N_m}{s_m}}{\binom{N}{s}}. \]

An ordered sample of size $s$ is an ordered $s$-tuple $(x_1, \dots, x_s) \subset \mathcal{P}$ (we may
think of $x_i$ as the object drawn at the $i$th drawing from the population). The
number of such samples is clearly $N(N - 1) \cdots (N - s + 1)$ (since there are $N$
possible outcomes of the first drawing, $N - 1$ for the second, etc.). Fixing one
specific object, let $A$ denote the event of getting that object in some specific
drawing. Since the procedure is equivalent to (ordered) sampling of size $s - 1$
from a population of size $N - 1$, we have $|A| = (N - 1) \cdots [(N - 1) - (s - 1) + 1]$,
and therefore, for random sampling (the uniform model!),

\[ P(A) = \frac{(N - 1) \cdots (N - s + 1)}{N (N - 1) \cdots (N - s + 1)} = \frac{1}{N}. \]

This probability is independent of the drawing considered! This fact is referred
to as the “equivalence law of ordered sampling”.
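
The equivalence law can be confirmed by brute-force enumeration in a small case. The sketch below assumes a population labelled $0, \dots, N - 1$; the function name and the values $N = 5$, $s = 3$ are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def prob_object_at_drawing(N, s, obj, k):
    """Probability that object `obj` is drawn at drawing number k (0-based)
    in a random ordered sample of size s from a population of size N."""
    samples = list(permutations(range(N), s))      # all ordered samples
    favourable = sum(1 for x in samples if x[k] == obj)
    return Fraction(favourable, len(samples))

N, s = 5, 3
# There are N (N - 1) ... (N - s + 1) ordered samples.
assert len(list(permutations(range(N), s))) == 5 * 4 * 3
# The probability is 1/N whichever drawing and whichever object is fixed.
for k in range(s):
    for obj in range(N):
        assert prob_object_at_drawing(N, s, obj, k) == Fraction(1, N)
```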


I.2. Probability space 


Let $(\Omega, \mathcal{A}, P)$ be a probability space, that is, a normalized finite positive measure
space. Following our terminology, the “measurable sets” $A \in \mathcal{A}$ are called the
events; $\Omega$ is the sure event; $\emptyset$ is the empty event; the measure $P$ is called
the probability. One says almost surely (a.s.) instead of “almost everywhere”
(or “with probability one”, since the complement of the exceptional set has
probability one).

If $f$ is a real valued function on $\Omega$, it is desirable that the sets $[f > c]$ be
events for any real $c$, that is, that $f$ be measurable. Such functions will be called
real random variables (r.v.). Similarly, a complex r.v. is a complex measurable
function on $\Omega$.

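The simplest r.v. is the indicator $I_A$ of an event $A \in \mathcal{A}$. We clearly have

\[ I_{A^c} = 1 - I_A; \tag{1} \]
\[ I_{A \cap B} = I_A I_B; \tag{2} \]
\[ I_{A \cup B} = I_A + I_B - I_{A \cap B} \tag{3} \]

for any events $A, B$, and

\[ I_{\bigcup_k A_k} = \sum_k I_{A_k} \tag{4} \]

for any sequence $\{A_k\}$ of mutually disjoint events.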
A finite linear combination of indicators is a “simple random variable”; $L^1(P)$
is the space of “integrable r.v.’s” (real or complex, as needed); the integral over
$\Omega$ of an integrable r.v. $X$ is called its expectation, and is denoted by $E(X)$ or
$EX$:

\[ E(X) := \int_\Omega X \, dP, \qquad X \in L^1(P). \]

The functional $E$ on $L^1(P)$ is linear, positive, bounded (with norm 1), and
$E1 = 1$. For any $A \in \mathcal{A}$,

\[ E(I_A) = P(A). \]


For a simple r.v. $X$, $EX$ is then the weighted arithmetical average of its values,
with weights equal to the probabilities that $X$ assumes these values.
The obvious relations

\[ P(A^c) = 1 - P(A); \qquad P(A \cup B) = P(A) + P(B) - P(A \cap B), \]

parallel (1) and (3); however, the probability analogue of (2), namely $P(A \cap
B) = P(A)P(B)$, is not true in general. One says that the events $A$ and $B$ are
(stochastically) independent if

\[ P(A \cap B) = P(A)P(B). \]


More generally, a family $\mathcal{F} \subset \mathcal{A}$ of events is (stochastically) independent if

\[ P\Big(\bigcap_{k \in J} A_k\Big) = \prod_{k \in J} P(A_k) \]

for any finite subset $\{A_k; k \in J\} \subset \mathcal{F}$.

The random variables $X_1,\dots,X_n$ are (stochastically) independent if for any
choice of Borel sets $B_1,\dots,B_n$ in $\mathbb{R}$ (or $\mathbb{C}$), the events $X_1^{-1}(B_1),\dots,X_n^{-1}(B_n)$
are independent.


Theorem I.2.1. If $X_1,\dots,X_n$ are (real) independent r.v.’s, and $f_1,\dots,f_n$ are
(real or complex) Borel functions on $\mathbb{R}$, then $f_1(X_1),\dots,f_n(X_n)$ are independent
r.v.’s.


Proof. For simplicity of notation, we take $n = 2$ (the general case is analogous).
Thus, $X, Y$ are independent r.v.’s, and $f, g$ are real (or complex) Borel functions
on $\mathbb{R}$. Let $A, B$ be Borel subsets of $\mathbb{R}$ (or $\mathbb{C}$). Then

\[ P\big(f(X)^{-1}(A) \cap g(Y)^{-1}(B)\big) = P\big(X^{-1}(f^{-1}(A)) \cap Y^{-1}(g^{-1}(B))\big) \]
\[ = P\big(X^{-1}(f^{-1}(A))\big) P\big(Y^{-1}(g^{-1}(B))\big) = P\big(f(X)^{-1}(A)\big) P\big(g(Y)^{-1}(B)\big). \]

In particular, when $X, Y$ are independent r.v.’s, the random variables $aX + b$
and $cY + d$ are independent for any constants $a, b, c, d$. For example, if $X, Y$ are
independent integrable r.v.’s, then $X - EX$ and $Y - EY$ are independent central
(integrable) r.v.’s, where “central” means “with expectation zero”.


Theorem I.2.2 (Multiplicativity of E on independent r.v.’s). If
$X_1,\dots,X_n$ are independent integrable r.v.’s, then $\prod_k X_k$ is integrable and

\[ E\Big(\prod_k X_k\Big) = \prod_k E(X_k). \]

Proof. The proof is by induction on $n$. It suffices therefore to prove the theorem
for two independent integrable r.v.’s $X, Y$.
Case 1. Simple r.v.’s:

\[ X = \sum_j x_j I_{A_j}, \qquad Y = \sum_k y_k I_{B_k}, \]

with all $x_j$ distinct, and all $y_k$ distinct. Thus $A_j = X^{-1}(\{x_j\})$ and $B_k =
Y^{-1}(\{y_k\})$ are independent events. Hence

\[ E(XY) = E\Big(\sum_{j,k} x_j y_k I_{A_j} I_{B_k}\Big) = \sum_{j,k} x_j y_k E(I_{A_j \cap B_k}) \]
\[ = \sum_{j,k} x_j y_k P(A_j \cap B_k) = \sum_{j,k} x_j y_k P(A_j) P(B_k) \]
\[ = \sum_j x_j P(A_j) \sum_k y_k P(B_k) = E(X)E(Y). \]
j k 


Case 2. Non-negative (integrable) r.v.’s $X, Y$:
For $n = 1, 2, \dots$, let

\[ X_n := \sum_{k=1}^{n2^n} \frac{k-1}{2^n} I_{[(k-1)/2^n \le X < k/2^n]}, \]

and let $Y_n$ be defined in the same way from $Y$.

For each $n$, $X_n, Y_n$ are independent (by Theorem I.2.1), so that by Case 1, $E(X_nY_n) = E(X_n)E(Y_n)$.
Since the non-decreasing sequences $\{X_n\}$, $\{Y_n\}$, and $\{X_nY_n\}$ converge to $X$, $Y$,
and $XY$, respectively, it follows from the Lebesgue monotone convergence
theorem that

\[ E(XY) = \lim E(X_nY_n) = \lim E(X_n)E(Y_n) = E(X)E(Y) \]

(and in particular, $XY$ is integrable).


Case 3. $X, Y$ real independent integrable r.v.’s:
In this case, $|X|, |Y|$ are independent (by Theorem I.2.1), and by Case 2,

\[ E(|XY|) = E(|X|)E(|Y|) < \infty, \]

so that $XY$ is integrable. Also by Theorem I.2.1, $X', Y'$ are independent r.v.’s,
where the prime stands for either $+$ or $-$. Therefore, by Case 2,

\[ E(XY) = E\big((X^+ - X^-)(Y^+ - Y^-)\big) \]
\[ = E(X^+)E(Y^+) - E(X^-)E(Y^+) - E(X^+)E(Y^-) + E(X^-)E(Y^-) \]
\[ = [E(X^+) - E(X^-)][E(Y^+) - E(Y^-)] = E(X)E(Y). \]

The case of complex $X, Y$ follows from Case 3 in a similar fashion.


Definition I.2.3. If $X$ is a real r.v., its characteristic function (ch.f.) is defined
by

\[ f_X(u) := E(e^{iuX}) \qquad (u \in \mathbb{R}). \]

Clearly $f_X$ is a well-defined complex valued function, $|f_X| \le 1$, $f_X(0) = 1$,
and one verifies easily that it is uniformly continuous on $\mathbb{R}$.


Corollary I.2.4. The ch.f. of the sum of independent real r.v.’s is the product
of their ch.f.’s.


Proof. If $X_1,\dots,X_n$ are independent real r.v.’s, it follows from Theorem I.2.1
that $e^{iuX_1},\dots,e^{iuX_n}$ are independent (complex) integrable r.v.’s, and therefore,
if $X := \sum_k X_k$ and $u \in \mathbb{R}$,

\[ f_X(u) := E(e^{iuX}) = E\Big(\prod_k e^{iuX_k}\Big) = \prod_k E(e^{iuX_k}) = \prod_k f_{X_k}(u) \]

by Theorem I.2.2.


I.2.1 $L^2$-random variables
Terminology I.2.5. If $X \in L^2(P)$, Schwarz’s inequality shows that

\[ E(|X|) = E(1 \cdot |X|) \le \|1\|_2 \|X\|_2 = \|X\|_2, \]

that is, $X$ is integrable, and

\[ \sigma(X) := \|X - EX\|_2 < \infty \]

is called the standard deviation (s.d.) of $X$ (this is the $L^2$-distance from $X$ to its
expectation). The square of the s.d. is the variance of $X$.

If $X, Y$ are real $L^2$-r.v.’s, the product $(X - EX)(Y - EY)$ is integrable (by
Schwarz’s inequality). One defines the covariance of $X$ and $Y$ by

\[ \mathrm{cov}(X, Y) := E\big((X - EX)(Y - EY)\big). \]


In particular,

\[ \mathrm{cov}(X, X) = \sigma^2(X). \]

By Schwarz’s inequality,

\[ |\mathrm{cov}(X, Y)| \le \sigma(X)\sigma(Y). \]

The linearity of $E$ implies that

\[ \mathrm{cov}(X, Y) = E(XY) - E(X)E(Y), \tag{5} \]

and in particular (for $Y = X$),

\[ \sigma^2(X) = E(X^2) - (EX)^2. \tag{6} \]

By (5), $\mathrm{cov}(X, Y) = 0$ if $X, Y$ are independent (cf. Theorem I.2.2). The converse
is false in general, as can be seen by simple counter-examples.
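
One such counter-example can be worked out exactly on a three-point space. The sketch below is an illustration under assumed choices (X uniform on {-1, 0, 1} and Y = X^2; the helper `E` is a hypothetical name): the covariance vanishes, although X and Y are clearly dependent.

```python
from fractions import Fraction

# Illustrative finite probability space: X is uniform on {-1, 0, 1}, Y = X**2.
omega = [-1, 0, 1]
P = {w: Fraction(1, 3) for w in omega}

def E(f):
    """Expectation of the r.v. f on the finite probability space (omega, P)."""
    return sum(f(w) * P[w] for w in omega)

X = lambda w: w
Y = lambda w: w * w

# Relation (5): cov(X, Y) = E(XY) - E(X)E(Y).
cov_XY = E(lambda w: X(w) * Y(w)) - E(X) * E(Y)

# Independence would require P[X = 0, Y = 0] = P[X = 0] P[Y = 0].
p_joint = sum(P[w] for w in omega if X(w) == 0 and Y(w) == 0)
p_prod = (sum(P[w] for w in omega if X(w) == 0)
          * sum(P[w] for w in omega if Y(w) == 0))
```

Here `cov_XY` is zero while `p_joint` and `p_prod` differ, so the r.v.’s are uncorrelated but not independent.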


The $L^2$-r.v.’s $X, Y$ are said to be uncorrelated if $\mathrm{cov}(X, Y) = 0$.
If $X = I_A$ and $Y = I_B$ ($A, B \in \mathcal{A}$), then by (5),

\[ \mathrm{cov}(I_A, I_B) = E(I_A I_B) - E(I_A)E(I_B) = P(A \cap B) - P(A)P(B). \tag{7} \]

Thus, indicators are uncorrelated iff they are independent! Taking $B = A$ (with
$PA = p$, so that $P(A^c) = 1 - p := q$), we see from (6) that

\[ \sigma^2(I_A) = E(I_A) - E(I_A)^2 = p - p^2 = pq. \tag{8} \]
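Lemma I.2.6. Let $X_1,\dots,X_n$ be real $L^2$-r.v.’s. Then

\[ \sigma^2\Big(\sum_k X_k\Big) = \sum_k \sigma^2(X_k) + 2 \sum_{1 \le j < k \le n} \mathrm{cov}(X_j, X_k). \]

In particular, if $X_j$ are pairwise uncorrelated, then

\[ \sigma^2\Big(\sum_k X_k\Big) = \sum_k \sigma^2(X_k) \]

(Bienaymé’s identity).
Proof.

\[ \sigma^2\Big(\sum_k X_k\Big) = E\Big(\sum_k X_k - E\sum_k X_k\Big)^2 = E\Big(\sum_k [X_k - EX_k]\Big)^2 \]
\[ = E\sum_k (X_k - EX_k)^2 + 2E\sum_{j<k} (X_j - EX_j)(X_k - EX_k) \]
\[ = \sum_k \sigma^2(X_k) + 2\sum_{j<k} \mathrm{cov}(X_j, X_k). \]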


Example I.2.7. Let $\{A_k\} \subset \mathcal{A}$ be a sequence of pairwise independent events.
Let

\[ S_n := \sum_{k=1}^n I_{A_k}, \qquad n = 1, 2, \dots \]

Then

\[ ES_n = \sum_{k=1}^n PA_k; \qquad \sigma^2(S_n) = \sum_{k=1}^n PA_k(1 - PA_k). \]

In particular, when $PA_k = p$ for all $k$ (the “Bernoulli case”), we have

\[ ES_n = np; \qquad \sigma^2(S_n) = npq. \]

Note that for each $\omega \in \Omega$, $S_n(\omega)$ is the number of events $A_k$ with $k \le n$ for which
$\omega \in A_k$ (“the number of successes in the first $n$ trials”).

For $0 \le j \le n$, $S_n(\omega) = j$ iff there are precisely $j$ events $A_k$, $1 \le k \le n$, such
that $\omega \in A_k$ (and $\omega \in A_k^c$ for the remaining $n - j$ events). Since there are $\binom{n}{j}$
possibilities to choose $j$ indices $k$ from the set $\{1,\dots,n\}$ (for which $\omega \in A_k$),
and these choices define mutually disjoint events, we have in the Bernoulli case

\[ P[S_n = j] = \binom{n}{j} p^j q^{n-j}. \tag{*} \]

One calls $S_n$ the “Bernoulli random variable”, and (*) is its distribution.
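
As a sanity check, the expectation and variance can be recomputed directly from the distribution (*). The Python sketch below is illustrative only; the function name `bernoulli_pmf` and the parameters n = 10, p = 3/10 are assumptions made for the example.

```python
from fractions import Fraction
from math import comb

def bernoulli_pmf(n, p):
    """List of P[S_n = j], j = 0..n, according to (*), with q = 1 - p."""
    q = 1 - p
    return [comb(n, j) * p**j * q**(n - j) for j in range(n + 1)]

n, p = 10, Fraction(3, 10)
pmf = bernoulli_pmf(n, p)

# Moments computed from the distribution itself.
mean = sum(j * pj for j, pj in enumerate(pmf))
var = sum(j * j * pj for j, pj in enumerate(pmf)) - mean**2
```

With exact rationals, `mean` and `var` agree with np and npq, respectively.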


Example I.2.8. Consider random sampling from a two-layered population (see
Section I.1, Example 3). Let $B_k$ be the event of getting an object from the layer
$P_1$ in the $k$th drawing, and let $D_s = \sum_{k=1}^s I_{B_k}$. In our previous notations (with
$m = 2$),

\[ P[D_s = s_1] = P(A_{s_1,s_2}) = \binom{N_1}{s_1}\binom{N_2}{s_2} \Big/ \binom{N}{s}, \]

where $s_1 + s_2 = s$ and $N_1 + N_2 = N$.
By the equivalence principle of ordered sampling (cf. Section I.1), $PB_k =
N_1/N$ for all $k$, and therefore

\[ ED_s = \sum_{k=1}^s PB_k = s\frac{N_1}{N}. \]

Note that the events $B_k$ are dependent (drawing without return!). In the case of
drawings with returns, the events $B_k$ are independent, and $D_s$ is the Bernoulli
r.v., for which we saw that $ED_s = sp = s(N_1/N)$ (since $p := PB_k$). Note that
the expectation is the same in both cases (of drawings with or without returns).
For $1 \le j < k \le s$, one has

\[ P(B_k \cap B_j) = \frac{N_1}{N} \cdot \frac{N_1 - 1}{N - 1} \]

by the equivalence principle of ordered sampling. Therefore, by (7),

\[ \mathrm{cov}(I_{B_k}, I_{B_j}) = \frac{N_1}{N} \cdot \frac{N_1 - 1}{N - 1} - \Big(\frac{N_1}{N}\Big)^2 \]

independently of $k, j$. By Lemma I.2.6,

\[ \sigma^2(D_s) = \frac{N - s}{N - 1} \, s \, \frac{N_1}{N}\Big(1 - \frac{N_1}{N}\Big). \]

Thus, the difference between the variances for the methods of drawing with or
without returns appears in the correcting factor $(N - s)/(N - 1)$, which is close
to 1 when the sample size $s$ is small relative to the population size $N$.

One calls $D_s$ the hypergeometric random variable.
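
The variance formula with its correcting factor can be compared with the moments computed from the hypergeometric distribution itself. The following Python sketch is an illustration; the name `hypergeometric_pmf` and the values N = 20, N_1 = 8, s = 5 are assumptions for the example.

```python
from fractions import Fraction
from math import comb

def hypergeometric_pmf(N, N1, s):
    """P[D_s = s1] for s1 = 0..s when s objects are drawn without return
    from a population of size N containing N1 objects of the first layer."""
    return [Fraction(comb(N1, s1) * comb(N - N1, s - s1), comb(N, s))
            for s1 in range(s + 1)]

N, N1, s = 20, 8, 5
pmf = hypergeometric_pmf(N, N1, s)

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k * k * pk for k, pk in enumerate(pmf)) - mean**2

# Formula of the text: Bernoulli variance s p q times (N - s)/(N - 1).
p = Fraction(N1, N)
expected_var = Fraction(N - s, N - 1) * s * p * (1 - p)
```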


Example I.2.9. Suppose we mark $N > 1$ objects with numbers $1,\dots,N$. In
drawings without returns from this population of objects, let $A_k$ denote the event
of drawing precisely the $k$th object in the $k$th drawing (“matching” in the $k$th
drawing). In this case, the r.v.

\[ S = \sum_{k=1}^N I_{A_k} \]

“is” the number of matchings in $N$ drawings.
By the equivalence principle of ordered sampling, $PA_k = 1/N$ and $P(A_k \cap
A_j) = (1/N)(1/(N - 1))$, independently of $k$ and $j < k$. Hence,

\[ ES = \sum_{k=1}^N PA_k = 1, \]

\[ \mathrm{cov}(I_{A_k}, I_{A_j}) = \frac{1}{N(N-1)} - \frac{1}{N^2} = \frac{1}{N^2(N-1)}, \]

and consequently, by Lemma I.2.6,

\[ \sigma^2(S) = N \cdot \frac{1}{N}\Big(1 - \frac{1}{N}\Big) + 2\binom{N}{2}\frac{1}{N^2(N-1)} = 1. \]
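
That both the expectation and the variance of the number of matchings equal 1 for every N > 1 can be confirmed by enumerating all permutations for small N. The sketch below is illustrative; `matching_moments` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import permutations

def matching_moments(N):
    """Exact mean and variance of the number of matchings (fixed points)
    over all N! equally likely drawing orders."""
    perms = list(permutations(range(N)))
    counts = [sum(1 for k, x in enumerate(perm) if x == k) for perm in perms]
    mean = Fraction(sum(counts), len(perms))
    var = Fraction(sum(c * c for c in counts), len(perms)) - mean**2
    return mean, var

# Check a few small population sizes.
results = {N: matching_moments(N) for N in range(2, 7)}
```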


Lemma I.2.10. Let $X$ be any r.v. and $\epsilon > 0$.
(1) If $X \in L^2(P)$, then

\[ P[|X - EX| \ge \epsilon] \le \frac{\sigma^2(X)}{\epsilon^2} \]

(Tchebichev’s inequality).
(2) If $|X| \le 1$, then

\[ P[|X| \ge \epsilon] \ge E(|X|^2) - \epsilon^2 \]

(Kolmogorov’s inequality).


Proof. Denote $A = [|X - EX| \ge \epsilon]$. Since $|X - EX|^2 \ge |X - EX|^2 I_A \ge \epsilon^2 I_A$,
the monotonicity of $E$ implies that

\[ \sigma^2(X) := E(|X - EX|^2) \ge \epsilon^2 E(I_A) = \epsilon^2 P(A), \]

and Part (1) is verified.
In case $|X| \le 1$, denote $A = [|X| \ge \epsilon]$. Then

\[ |X|^2 = |X|^2 I_A + |X|^2 I_{A^c} \le I_A + \epsilon^2, \]

and Part (2) follows by applying $E$.


Corollary I.2.11. Let $X$ be an integrable r.v. with $|X - EX| \le 1$. Then for any
$\epsilon > 0$,

\[ \sigma^2(X) - \epsilon^2 \le P[|X - EX| \ge \epsilon] \le \frac{\sigma^2(X)}{\epsilon^2}. \]

(Note that $X$ is bounded, hence in $L^2(P)$, so that we may apply Part (1) to $X$
and Part (2) to $X - EX$.)


Corollary I.2.12. Let $\{A_k\}$ be a sequence of events, and let $X_n =
(1/n)\sum_{k=1}^n I_{A_k}$ (the occurrence frequency of the first $n$ events). Then $X_n - EX_n$
converge to zero in probability (i.e. $P[|X_n - EX_n| \ge \epsilon] \to 0$ as $n \to \infty$, for any
$\epsilon > 0$) if and only if $\sigma^2(X_n) \to 0$.


Proof. Since $0 \le I_{A_k}, PA_k \le 1$, we clearly have $|I_{A_k} - PA_k| \le 1$, and therefore

\[ |X_n - EX_n| \le \frac{1}{n}\sum_{k=1}^n |I_{A_k} - PA_k| \le 1. \]

The result follows then by applying Corollary I.2.11.


Example I.2.13. Suppose the events $A_k$ are pairwise independent and $PA_k = p$
for all $k$. By Example I.2.7, $EX_n = p$ and $\sigma^2(X_n) = pq/n \to 0$, and consequently,
by Corollary I.2.12, $X_n \to p$ in probability. This is the Bernoulli Law of Large
Numbers (the “success frequencies” converge in probability to the probability of
success when the number of trials tends to $\infty$).
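
The rate at which the deviation probabilities decay can be compared numerically with the Tchebichev bound $pq/(n\epsilon^2)$. The sketch below is an illustration under an extra assumption: the events are taken fully independent, so that $nX_n$ has the Bernoulli distribution of Example I.2.7; the name `tail_prob` and the values of p and eps are choices made for the example.

```python
from fractions import Fraction
from math import comb

def tail_prob(n, p, eps):
    """Exact P[|X_n - p| >= eps], where n X_n is the Bernoulli r.v.
    with parameters n, p (independent events assumed)."""
    q = 1 - p
    return sum(comb(n, j) * p**j * q**(n - j)
               for j in range(n + 1)
               if abs(Fraction(j, n) - p) >= eps)

p, eps = Fraction(1, 2), Fraction(1, 10)

# Each row: (n, exact deviation probability, Tchebichev bound pq/(n eps^2)).
rows = [(n, tail_prob(n, p, eps), p * (1 - p) / (n * eps**2))
        for n in (10, 100, 1000)]
```

In each row the exact probability stays below the bound, and both decrease as n grows; the bound is crude but suffices for convergence in probability.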


Example I.2.14. Let $\{X_k\}$ be a sequence of pairwise uncorrelated $L^2$-random
variables, with

\[ EX_k = \mu, \qquad \sigma(X_k) = \sigma \qquad (k = 1, 2, \dots) \]

(e.g., $X_k$ is the outcome of the $k$th random drawing from an infinite population,
or from a finite population with returns). Let

\[ M_n := \frac{1}{n}\sum_{k=1}^n X_k \]

(the “sample mean” for a sample of size $n$). Then, by Lemma I.2.6,

\[ E(M_n) = \mu; \qquad \sigma^2(M_n) = \sigma^2/n. \]

By Lemma I.2.10,

\[ P[|M_n - \mu| \ge \epsilon] \le \frac{\sigma^2}{n\epsilon^2} \]

for any $\epsilon > 0$, and therefore $M_n \to \mu$ in probability (when $n \to \infty$). This
is the so-called weak law of large numbers (the sample means converge to the
expectation $\mu$ when the sample size tends to infinity). The special case $X_k = I_{A_k}$
for pairwise independent events $A_k$ with $PA_k = p$ is precisely the Bernoulli law
of large numbers of Example I.2.13.

The Bernoulli law is generalized to dependent events in the next section. 


Theorem I.2.15 (Generalized Bernoulli law of large numbers). Let $\{A_k\}$
be a sequence of events. Set

\[ p_1(n) := \frac{1}{n}\sum_{k=1}^n PA_k, \]

\[ p_2(n) := \frac{1}{\binom{n}{2}}\sum_{1 \le j < k \le n} P(A_k \cap A_j), \]

and

\[ d_n := p_2(n) - p_1^2(n). \]

Let $X_n$ be as in Corollary I.2.12 [the occurrence frequency of the first $n$ events].
Then $X_n - EX_n \to 0$ in probability if and only if $d_n \to 0$.


Proof. By Lemma I.2.6 and relations (7) and (8) preceding it, we obtain by a
straightforward calculation

\[ \sigma^2(X_n) = \frac{1}{n^2}\sum_{k=1}^n [PA_k - (PA_k)^2] + \frac{2}{n^2}\sum_{1 \le j < k \le n} [P(A_k \cap A_j) - PA_k \cdot PA_j] \]
\[ = d_n + (1/n)[p_1(n) - p_2(n)]. \tag{9} \]

Therefore

\[ |\sigma^2(X_n) - d_n| \le (1/n)|p_1(n) - p_2(n)|. \]

However, $p_i(n)$ are arithmetical means of numbers in the interval $[0,1]$, hence
they belong to $[0,1]$; therefore $|p_1 - p_2| \le 1$ and so

\[ |\sigma^2(X_n) - d_n| \le \frac{1}{n}. \tag{10} \]

In particular, $\sigma^2(X_n) \to 0$ iff $d_n \to 0$, and the theorem follows then from
Corollary I.2.12.
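
Relation (9) and the bound (10) can be verified exactly on a small finite probability space with dependent events. The Python sketch below is an illustration only: the six-point uniform space and the three events are arbitrary assumptions, and `P` is a hypothetical helper for the uniform probability.

```python
from fractions import Fraction
from itertools import combinations

# Illustrative uniform probability space with three dependent events.
omega = range(6)
events = [{0, 1, 2}, {1, 2, 3, 4}, {2, 5}]
n = len(events)

def P(A):
    """Uniform probability of the event A (a subset of omega)."""
    return Fraction(len(A), len(omega))

p1 = sum(P(A) for A in events) / n
p2 = sum(P(A & B) for A, B in combinations(events, 2)) / Fraction(n * (n - 1), 2)
d = p2 - p1**2

# Occurrence frequency X_n as a function on omega, and its exact moments.
X = [Fraction(sum(w in A for A in events), n) for w in omega]
EX = sum(X) / len(omega)
var = sum((x - EX)**2 for x in X) / len(omega)
```

Here `var` equals `d + (p1 - p2)/n`, as in (9), and differs from `d` by at most 1/n, as in (10).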


Remark I.2.16. Note that when the events $A_k$ are pairwise independent,

\[ |d_n| = \Big|\frac{2}{n(n-1)}\sum_{1 \le j < k \le n} P(A_k)P(A_j) - \frac{1}{n^2}\Big[2\sum_{j<k} P(A_k)P(A_j) + \sum_{k=1}^n P(A_k)^2\Big]\Big| \]
\[ = (1/n)\Big|\frac{2}{n(n-1)}\sum_{j<k} P(A_k)P(A_j) - \frac{1}{n}\sum_{k=1}^n P(A_k)^2\Big|. \]

Both arithmetical means between the absolute value signs are means of numbers
in $[0,1]$, and are therefore in $[0,1]$. The distance between them is thus $\le 1$, hence
$|d_n| \le 1/n$, and the condition of the theorem is satisfied. Hence, $X_n - EX_n$
converge in probability to zero, even without the assumption $PA_k = p$ (for all $k$)
of the Bernoulli case.

We consider next the stronger property of almost sure convergence to zero of
$X_n - EX_n$.


Theorem I.2.17 (Borel’s strong law of large numbers). With notations
as in Theorem I.2.15, suppose that $d_n = O(1/n)$. Then $X_n - EX_n$ converge to
zero almost surely.


This happens in particular when the events $A_k$ are pairwise independent,
hence in the Bernoulli case.


Proof. We first prove the following.


Lemma. Let $\{X_n\}$ be any sequence of r.v.’s such that

\[ \sum_n P[|X_n| \ge 1/m] < \infty \]

for all $m = 1, 2, \dots$.
Then $X_n \to 0$ almost surely.


Proof of lemma. Observe that by definition of convergence to 0,

\[ [X_n \to 0] = \bigcap_m \bigcup_n \bigcap_k [|X_{n+k}| < 1/m], \]

where all indices run from 1 to $\infty$. By DeMorgan’s laws, we then have

\[ [X_n \to 0]^c = \bigcup_m \bigcap_n \bigcup_k [|X_{n+k}| \ge 1/m]. \tag{11} \]

Denote the “innermost” union in (11) by $B_{nm}$, and let $B_m := \bigcap_n B_{nm}$. We have
(by the $\sigma$-subadditivity of $P$):

\[ PB_{nm} \le \sum_k P[|X_{n+k}| \ge 1/m] = \sum_{r=n+1}^\infty P[|X_r| \ge 1/m] \to_{n \to \infty} 0 \]

by the lemma’s hypothesis, for all $m$. Therefore, since $PB_m \le PB_{nm}$ for all $n$,
we have $PB_m = 0$ (for all $m$), and consequently

\[ P([X_n \to 0]^c) = P\Big(\bigcup_m B_m\Big) = 0. \]


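Back to the proof of the theorem, recall (10) from Theorem I.2.15:

\[ |\sigma^2(X_n) - d_n| \le 1/n. \]

Since $|d_n| \le c/n$ by hypothesis, we have $\sigma^2(X_n) \le (c + 1)/n$. By Tchebichev’s
inequality,

\[ \sum_k P[|X_{k^2} - EX_{k^2}| \ge 1/m] \le m^2 \sum_k \sigma^2(X_{k^2}) \le (c + 1)m^2 \sum_k (1/k^2) < \infty. \]

By the lemma, we then have almost surely

\[ X_{k^2} - EX_{k^2} \to 0. \]

For each $n \in \mathbb{N}$, let $k$ be the unique $k \in \mathbb{N}$ such that

\[ k^2 \le n < (k + 1)^2. \]

Necessarily $n - k^2 \le 2k$ and $k \to \infty$ when $n \to \infty$. We have

\[ |X_n - X_{k^2}| \le \Big(\frac{1}{k^2} - \frac{1}{n}\Big)\sum_{j=1}^{k^2} I_{A_j} + \frac{1}{n}\sum_{j=k^2+1}^{n} I_{A_j} \le 2\,\frac{n - k^2}{n} \le \frac{4}{k}. \]

Hence, also

\[ |EX_{k^2} - EX_n| = |E(X_{k^2} - X_n)| \le 4/k, \]

and therefore,

\[ |X_n - EX_n| \le |X_n - X_{k^2}| + |X_{k^2} - EX_{k^2}| + |EX_{k^2} - EX_n| \le 8/k + |X_{k^2} - EX_{k^2}| \to 0 \]

almost surely, when $n \to \infty$.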


I.2.18. Let $\{A_n\}$ be a sequence of events. Recall the notation

\[ \limsup A_n := \bigcap_{k=1}^\infty \bigcup_{j=k}^\infty A_j. \]

This event occurs iff for each $k \in \mathbb{N}$, there exists $j \ge k$ such that $A_j$ occurs, that
is, iff infinitely many $A_n$ occur.


Lemma I.2.19 (The Borel–Cantelli lemma). If

\[ \sum_n P(A_n) < \infty, \tag{*} \]

then

\[ P(\limsup A_n) = 0. \]

Proof.

\[ P(\limsup A_n) \le P\Big(\bigcup_{j=k}^\infty A_j\Big) \le \sum_{j=k}^\infty P(A_j) \]

for all $k$, and the conclusion follows from (*) by letting $k \to \infty$.


Example I.2.20 (the mouse problem). Consider a row of three connected
chambers, denoted L (left), M (middle), and R (right). Chamber R has also a
right exit to “freedom” (F). Chamber L has also a left exit to a “death” trap D.
A mouse, located originally in M, moves to a chamber to its right (left) with a
fixed probability $p$ ($q := 1 - p$). The moves between chambers are independent.
Thus, after 2 moves, we have

\[ P(F_1) = p^2; \qquad P(D_1) = q^2; \qquad P(M_1) = 2pq, \]

where $F_1, D_1, M_1$ denote, respectively, the events that the mouse reaches F, D,
or M after precisely 2 moves. In general, let $M_k$ denote the event that the
mouse reaches back M (for the $k$th time) after precisely $2k$ moves. Clearly

\[ P(M_k) = (2pq)^k. \]

Since $\sum_k P(M_k) < \infty$, the Borel–Cantelli lemma implies that

\[ P(\limsup M_k) = 0, \]

that is, with probability 1, there exists $k \in \mathbb{N} \cup \{0\}$ such that the mouse moves
either to F or to D at its $2(k+1)$-th move. The probabilities of these events are
$(2pq)^k p^2$ and $(2pq)^k q^2$, respectively. Denoting also by $F$ (or $D$) the event that
the mouse reaches freedom (or death) in some move, then $F$ is the disjoint union
of the $F_k$ (and similarly for $D$). Thus

\[ PF = \sum_{k=0}^\infty (2pq)^k p^2 = \frac{p^2}{1 - 2pq}, \]

and similarly

\[ PD = \frac{q^2}{1 - 2pq}. \]

This is coherent with the preceding observation (that with probability 1, either
F or D occurs), since the sum of these two probabilities is clearly 1.
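
The closed forms for the two escape probabilities are easy to check with exact arithmetic. The sketch below is illustrative; the function name `escape_probabilities`, the value p = 2/3, and the truncation at 200 terms are assumptions made for the example.

```python
from fractions import Fraction

def escape_probabilities(p, terms=200):
    """Closed forms for P(F) and P(D), plus a partial sum of the
    geometric series sum_k (2pq)^k p^2 for comparison."""
    q = 1 - p
    PF_closed = p**2 / (1 - 2 * p * q)
    PD_closed = q**2 / (1 - 2 * p * q)
    PF_partial = sum((2 * p * q)**k * p**2 for k in range(terms))
    return PF_closed, PD_closed, PF_partial

PF, PD, PF_partial = escape_probabilities(Fraction(2, 3))
```

For p = 2/3 this gives PF = 4/5 and PD = 1/5; the two sum to 1 because $p^2 + q^2 = 1 - 2pq$.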


The case of events with $\sum_n P(A_n) = \infty$ is considered in Theorem I.2.21.
Notation is as in Theorem I.2.15.


Theorem I.2.21 (Erdős–Rényi). Let $\{A_n\}$ be a sequence of events such that

\[ \sum_n P(A_n) = \infty \tag{12} \]

and

\[ \liminf_n \frac{p_2(n)}{p_1^2(n)} = 1. \tag{13} \]

Then

\[ P(\limsup A_n) = 1. \]
Proof. Let $X_n = (1/n)\sum_{k=1}^n I_{A_k}$. Then

\[ EX_n = p_1(n) \]

and

\[ \sigma^2(X_n) = p_2(n) - p_1^2(n) + (1/n)[p_1(n) - p_2(n)] \]

(cf. (9) in Theorem I.2.15). By Tchebichev’s inequality,

\[ P[|X_n - p_1(n)| \ge p_1(n)/2] \le \frac{\sigma^2(X_n)}{[p_1(n)/2]^2} = 4\Big[\Big(1 - \frac{1}{n}\Big)\frac{p_2(n)}{p_1^2(n)} - 1 + \frac{1}{np_1(n)}\Big]. \]

By (12),

\[ np_1(n) = \sum_{k=1}^n PA_k \to \infty. \tag{14} \]

Hence by (13), the lim inf of the right-hand side is 0. Thus

\[ \liminf_n P[|X_n - p_1(n)| \ge p_1(n)/2] = 0. \]

Clearly

\[ [X_n < p_1(n)/2] \subset [|X_n - p_1(n)| \ge p_1(n)/2], \]

and therefore

\[ \liminf_n P[X_n < p_1(n)/2] = 0. \]


We may then choose a sequence of integers $1 \le n_1 < n_2 < \cdots$ such that

\[ \sum_k P[X_{n_k} < p_1(n_k)/2] < \infty. \]

By the Borel–Cantelli lemma,

\[ P(\limsup [X_{n_k} < p_1(n_k)/2]) = 0, \]

that is, with probability 1, $X_{n_k} < p_1(n_k)/2$ for only finitely many $k$’s. Thus, with
probability 1, there exists $k_0$ such that $X_{n_k} \ge p_1(n_k)/2$ for all $k > k_0$, that is,

\[ \sum_{j=1}^{n_k} I_{A_j} \ge n_k p_1(n_k)/2 \]

for all $k > k_0$, and since the right-hand side diverges to $\infty$ by (14), we have

\[ \sum_{j=1}^\infty I_{A_j} = \infty \]

with probability 1, that is, infinitely many $A_j$’s occur with probability 1.


Corollary I.2.22. Let $\{A_n\}$ be a sequence of pairwise independent events such
that

\[ \sum_n PA_n = \infty. \]

Then

\[ P(\limsup A_n) = 1. \]

Proof. We show that (13) of the Erdős–Rényi theorem is satisfied. We have

\[ \binom{n}{2} p_2(n) = \sum_{1 \le j < k \le n} P(A_k \cap A_j) = (1/2)\sum_{j \ne k;\, 1 \le j,k \le n} P(A_j)P(A_k) \]
\[ = (1/2)\Big[\sum_{j,k=1}^n P(A_j)P(A_k) - \sum_{k=1}^n P(A_k)^2\Big] \]
\[ = (1/2)\Big[n^2 p_1^2(n) - \sum_{k=1}^n P(A_k)^2\Big]. \]

Therefore

\[ \frac{p_2(n)}{p_1^2(n)} = \frac{n}{n-1} - \frac{\sum_{k=1}^n P(A_k)^2}{n(n-1)p_1^2(n)}. \]

However, since $PA_k \le 1$, the sum shown is $\le \sum_{k=1}^n PA_k := np_1(n)$. Therefore,
the second (non-negative) term on the right-hand side is $\le 1/[(n-1)p_1(n)] \to 0$ by
(14) (consequence of the divergence hypothesis). Hence, $\lim p_2(n)/p_1^2(n) = 1$.


Corollary I.2.23 (Zero-one law). Let $\{A_n\}$ be a sequence of pairwise
independent events. Then the event $\limsup A_n$ (that infinitely many $A_n$’s occur)
has probability 0 or 1, according to whether the series $\sum_n PA_n$ converges or
diverges (respectively).


I.3 Probability distributions 


Let $(\Omega, \mathcal{A}, P)$ be a probability space, let $X$ be a real- (or complex-, or $\mathbb{R}^n$-) valued
random variable on $\Omega$, and let $\mathcal{B}$ denote the Borel $\sigma$-algebra of the range space
($\mathbb{R}$, etc.). We set

\[ P_X(B) := P[X \in B] = P(X^{-1}(B)) \qquad (B \in \mathcal{B}). \]

The set function $P_X$ is called the distribution of $X$.
Theorem I.3.1 (stated for the case of a real r.v.).
(1) $(\mathbb{R}, \mathcal{B}, P_X)$ is a probability space.


(2) For any finite Borel function $g$ on $\mathbb{R}$, the distribution of the r.v. $g(X)$ is
given by
\[ P_{g(X)}(B) = P_X(g^{-1}(B)) \qquad (B \in \mathcal{B}). \]
(3) If the Borel function $g$ is integrable with respect to $P_X$, then

\[ E(g(X)) = \int_{\mathbb{R}} g \, dP_X. \]


Proof.

(1) Clearly, $0 \le P_X \le 1$, and $P_X(\mathbb{R}) = P(X^{-1}(\mathbb{R})) = P(\Omega) = 1$. If $\{B_k\} \subset \mathcal{B}$
is a sequence of mutually disjoint sets, then the sets $X^{-1}(B_k)$ are mutually
disjoint sets in $\mathcal{A}$, and therefore

\[ P_X\Big(\bigcup_k B_k\Big) := P\Big(X^{-1}\Big(\bigcup_k B_k\Big)\Big) = P\Big(\bigcup_k X^{-1}(B_k)\Big) \]
\[ = \sum_k P(X^{-1}(B_k)) = \sum_k P_X(B_k). \]

(2) We have for all $B \in \mathcal{B}$:

\[ P_{g(X)}(B) := P(g(X)^{-1}(B)) = P(X^{-1}(g^{-1}(B))) = P_X(g^{-1}(B)). \]

(3) If $g = I_B$ for some $B \in \mathcal{B}$, then $g(X) = I_{X^{-1}(B)}$, and therefore

\[ E(g(X)) = P(X^{-1}(B)) = P_X(B) = \int_{\mathbb{R}} g \, dP_X. \]

By linearity, Statement (3) is then valid for simple Borel functions $g$. If $g$
is a non-negative Borel function, there exists an increasing sequence of non-negative
simple Borel functions converging pointwise to $g$, and (3) follows from
the monotone convergence theorem (applied to the measures $P$ and $P_X$). If $g$ is
any (real) Borel function in $L^1(P_X)$, then, by the preceding case, $E(|g(X)|) =
\int_{\mathbb{R}} |g| \, dP_X < \infty$, that is, $g(X) \in L^1(P)$, and $E(g(X)) = E(g^+(X)) - E(g^-(X)) =
\int_{\mathbb{R}} g^+ \, dP_X - \int_{\mathbb{R}} g^- \, dP_X = \int_{\mathbb{R}} g \, dP_X$.

The routine extension to complex $g$ is omitted.


Definition I.3.2. The distribution function of the real r.v. $X$ is the function

\[ F_X(x) := P_X((-\infty, x)) = P[X < x] \qquad (x \in \mathbb{R}). \]

The integral $\int_{\mathbb{R}} g \, dP_X$ is also denoted $\int_{\mathbb{R}} g \, dF_X$.


Proposition I.3.3. $F_X$ is a non-decreasing, left-continuous function with range
in $[0,1]$, such that

\[ F_X(-\infty) = 0; \qquad F_X(\infty) = 1. \tag{*} \]




Proof. Exercise. 


Definition I.3.4. Any function $F$ with the properties listed in Proposition I.3.3
is called a distribution function. If Property (*) is omitted, the function $F$ is
called a quasi-distribution function.


Any (quasi-) distribution function induces a unique finite positive Lebesgue–Stieltjes
measure (it is a probability measure on $\mathbb{R}$ if (*) is satisfied), and
integration with respect to that measure is denoted by $\int_{\mathbb{R}} g \, dF$. In case $g$ is
a bounded continuous function on $\mathbb{R}$, this integral coincides with the (improper)
Riemann–Stieltjes integral $\int_{-\infty}^{\infty} g(x) \, dF(x)$.

The characteristic function of the (quasi-) distribution function $F$ is defined
by

\[ f(u) := \int_{\mathbb{R}} e^{iux} \, dF(x) \qquad (u \in \mathbb{R}). \]

By Theorem I.3.1, if $F = F_X$ for a real r.v. $X$, then $f$ coincides with the ch.f.
$f_X$ of Definition I.2.3.

In general, the ch.f. $f$ is a uniformly continuous function on $\mathbb{R}$, $|f| \le 1$, and
$f(0) = 1$ in case $F$ satisfies (*).


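Proposition I.3.5. Let $X$ be a real r.v., $b > 0$, and $a \in \mathbb{R}$. Then

\[ f_{a+bX}(u) = e^{iua} f_X(bu) \qquad (u \in \mathbb{R}). \]

Proof. Write $Y = a + bX$ and $y = a + bx$ ($x \in \mathbb{R}$). Since $b > 0$,

\[ F_X(x) = P[X < x] = P[Y < y] = F_Y(y), \]

and therefore,

\[ f_Y(u) = \int_{-\infty}^{\infty} e^{iuy} \, dF_Y(y) = \int_{-\infty}^{\infty} e^{iu(a+bx)} \, dF_X(x) = e^{iua} f_X(bu). \]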


We consider r.v.’s of class $L^r(P)$ for $r > 0$. The $L^r$-“norm”, denoted $\|X\|_r$,
satisfies (by Theorem I.3.1)

\[ \|X\|_r^r = E(|X|^r) = \int_{-\infty}^{\infty} |x|^r \, dF_X(x). \]

This expression is called the $r$th absolute central moment of $X$ (or of $F_X$). It
always exists, but could be infinite (unless $X \in L^r$).
The $r$th central moment of $X$ (or $F_X$) is

\[ m_r := E(X^r) = \int_{-\infty}^{\infty} x^r \, dF_X(x). \]

These concepts are used with any quasi-distribution function $F$, whenever they
make sense.


Lemma I.3.6.
(1) The function $\phi(r) := \log E(|X|^r)$ is convex on $[0, \infty)$, and $\phi(0) = 0$.


(2) $\|X\|_r$ is a non-decreasing function of $r$. In particular, if $X$ is of class $L^r$
for some $r > 0$, then it is of class $L^s$ for all $0 < s \le r$.


Proof. For any r.v.’s $Y, Z$, Schwarz’s inequality gives

\[ E(|YZ|) \le \|Y\|_2 \|Z\|_2. \]

For $r \ge s \ge 0$, choose $Y = |X|^{(r-s)/2}$ and $Z = |X|^{(r+s)/2}$. Then

\[ E(|X|^r) \le [E(|X|^{r-s}) E(|X|^{r+s})]^{1/2}, \]

so that

\[ \phi(r) \le (1/2)[\phi(r - s) + \phi(r + s)], \]

and (1) follows.
The slope of the chord joining the points $(0, 0)$ and $(r, \phi(r))$ on the graph of
$\phi$ is $\phi(r)/r$, and it increases with $r$, by convexity of $\phi$. Therefore $\|X\|_r = e^{\phi(r)/r}$
increases with $r$.


Theorem I.3.7. Let $X$ be an $L^r$-r.v. for some $r \ge 1$, and let $f = f_X$. Then for
all integers $1 \le k \le r$, the derivative $f^{(k)}$ exists, is uniformly continuous and
bounded by $E(|X|^k)$ ($< \infty$), and is given by

\[ f^{(k)}(u) = i^k \int_{-\infty}^{\infty} e^{iux} x^k \, dF(x), \tag{*} \]

(where $F := F_X$). In particular, the moment $m_k$ exists and is given by

\[ m_k = f^{(k)}(0)/i^k \qquad (k = 1,\dots,[r]). \]

Proof. By Lemma I.3.6, $E(|X|^k) < \infty$ for $k \le r$, and therefore the integral
in (*) converges absolutely and defines a continuous function $g_k(u)$. Also, by
Fubini’s theorem,

\[ \int_0^t g_k(u) \, du = \int_{-\infty}^{\infty} \int_0^t i^k e^{iux} \, du \, x^k \, dF(x) = g_{k-1}(t) - g_{k-1}(0). \]

Assuming (*) for $k - 1$, the last expression is equal to $f^{(k-1)}(t) - f^{(k-1)}(0)$. Since
the left-hand side is differentiable, with derivative $g_k(t)$, it follows that $f^{(k)}$ exists
and equals $g_k$. Since (*) reduces to the definition of $f$ for $k = 0$, Relation (*) for
general $k \le r$ follows by induction.


Corollary I.3.8. If the r.v. $X$ is in $L^k$ for all $k = 1, 2, \dots$, and if $f := f_X$ is
analytic in some real interval $|u| < R$, then

\[ f(u) = \sum_{k=0}^{\infty} i^k m_k u^k / k! \qquad (|u| < R). \]


Example I.3.9 (discrete r.v.). Let $X$ be a discrete real r.v., that is, its range
is the set $\{x_k\}$, with $x_k \in \mathbb{R}$ distinct, and let $P[X = x_k] = p_k$ ($\sum_k p_k = 1$). Then

\[ f_X(u) = \sum_k e^{iux_k} p_k. \tag{1} \]

For the Bernoulli r.v. with parameters $n, p$, we have $x_k = k$ ($0 \le k \le n$) and
$p_k = \binom{n}{k} p^k q^{n-k}$ ($q := 1 - p$). A short calculation starting from (1) gives

\[ f_X(u) = (pe^{iu} + q)^n. \]

By Theorem I.3.7,

\[ m_1 = f_X'(0)/i = np, \]
\[ m_2 = f_X''(0)/i^2 = (np)^2 + npq, \]

and therefore,

\[ \sigma^2(X) = m_2 - m_1^2 = npq \]

(cf. Example I.2.7).

The Poisson r.v. $X$ with parameter $\lambda > 0$ assumes exclusively the values $k$
($k = 0, 1, 2, \dots$) with

\[ P[X = k] = e^{-\lambda} \lambda^k / k!. \]

Then $F_X(x) = 0$ for $x \le 0$, and $= e^{-\lambda} \sum_{k < x} \lambda^k / k!$ for $x > 0$. Clearly, $F_X(\infty) =
1$, and by (1),

\[ f_X(u) = e^{-\lambda} \sum_k e^{iuk} \lambda^k / k! = e^{\lambda(e^{iu} - 1)}. \]

Note that $E(|X|^n) = e^{-\lambda} \sum_k k^n \lambda^k / k! < \infty$ for all $n$, and Corollary I.3.8 implies
then that $m_k = f^{(k)}(0)/i^k$ for all $k$. Thus, for example, one calculates that

\[ m_1 = \lambda, \qquad m_2 = \lambda(\lambda + 1), \]

and therefore

\[ \sigma^2(X) = \lambda. \]

(This can be reached of course directly from the definitions.)
The Poisson distribution is the limit of Bernoulli distributions in the following
sense.


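Proposition I.3.10. Let $\{A_{k,n}; k = 1,\dots,n; n = 1, 2, \dots\}$ be a “triangular
array” of events such that, for each $n = 1, 2, \dots$, $\{A_{k,n}; k = 1,\dots,n\}$ is a
Bernoulli system with parameter $p_n = \lambda/n$ ($\lambda > 0$ fixed) (cf. Example I.2.7). Let
$X_n := \sum_{k=1}^n I_{A_{k,n}}$ be the Bernoulli r.v. corresponding to the $n$th system. Then the
distribution of $X_n$ converges pointwise to the Poisson distribution when $n \to \infty$:

\[ P[X_n = k] \to_{n \to \infty} e^{-\lambda} \lambda^k / k! \qquad (k = 0, 1, 2, \dots). \]

Proof.

\[ P[X_n = k] = \binom{n}{k} (\lambda/n)^k (1 - \lambda/n)^{n-k} \]
\[ = \frac{\lambda^k}{k!} (1 - \lambda/n)^n \, \frac{(1 - 1/n)\cdots(1 - (k-1)/n)}{(1 - \lambda/n)^k} \to_{n \to \infty} e^{-\lambda} \frac{\lambda^k}{k!}. \]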
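
The convergence in Proposition I.3.10 can be observed numerically. The Python sketch below is an illustration; the function names and the values $\lambda = 2$, $k = 3$ are assumptions made for the example, and floating-point arithmetic is used, so the errors are approximate.

```python
from math import comb, exp, factorial

def bernoulli_prob(n, k, lam):
    """P[X_n = k] for the Bernoulli r.v. with parameters n and p_n = lam/n."""
    p = lam / n
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_prob(k, lam):
    """P[X = k] for the Poisson r.v. with parameter lam."""
    return exp(-lam) * lam**k / factorial(k)

lam, k = 2.0, 3
# Absolute errors for increasing n; they shrink roughly like 1/n.
errors = [abs(bernoulli_prob(n, k, lam) - poisson_prob(k, lam))
          for n in (10, 100, 1000, 10000)]
```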


Example I.3.11 (Distributions with density). If there exists an $L^1(dx)$-function
$h$ such that the distribution function $F$ has the form

\[ F(x) = \int_{-\infty}^x h(t) \, dt \qquad (x \in \mathbb{R}), \]

then $h$ is uniquely determined a.e. (with respect to Lebesgue measure $dx$ on $\mathbb{R}$),
and one has a.e. $F'(x) = h(x)$. In particular, $h \ge 0$ a.e., and since it is only
determined a.e., one assumes that $h \ge 0$ everywhere, and one calls $h$ (or $F'$) the
density of $F$. For any $g \in L^1(F)$,

\[ \int_B g \, dF = \int_B g F' \, dx \qquad (B \in \mathcal{B}). \]

We consider a few common densities.


I.3.12. The normal density.
The “standard” normal (or Gaussian) density is

\[ F'(x) = (2\pi)^{-1/2} e^{-x^2/2}. \]

To verify that the corresponding $F$ is a distribution function, we need only to
show that $F(\infty) = 1$. We write

\[ F(\infty)^2 = (1/2\pi) \int_{\mathbb{R}} \int_{\mathbb{R}} e^{-(t^2+s^2)/2} \, dt \, ds = (1/2\pi) \int_0^{2\pi} \int_0^{\infty} e^{-r^2/2} r \, dr \, d\theta = 1, \]

where we used polar coordinates. The ch.f. is

\[ f(u) = (2\pi)^{-1/2} \int_{\mathbb{R}} e^{iux} e^{-x^2/2} \, dx = (2\pi)^{-1/2} \int_{\mathbb{R}} e^{-[(x-iu)^2 + u^2]/2} \, dx = c(u) e^{-u^2/2}, \]

where $c(u) = (2\pi)^{-1/2} \int_{\mathbb{R}} e^{-(x-iu)^2/2} \, dx$. The Cauchy integral theorem is applied
to the entire function $e^{-z^2/2}$ with the rectangular path having vertices at
$-M$, $N$, $N - iu$, and $-M - iu$; letting then $M, N \to \infty$, one sees that
$c(u) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-x^2/2} \, dx = F(\infty) = 1$. Thus the ch.f. of the standard
normal distribution is $e^{-u^2/2}$.

If the r.v. $X$ has the standard normal distribution, and $Y = a + bX$ with
$a \in \mathbb{R}$ and $b > 0$, then for $y = a + bx$, we have (through the change of variable
$s = a + bt$):

\[ F_Y(y) = F_X(x) = (2\pi)^{-1/2} \int_{-\infty}^x e^{-t^2/2} \, dt = (2\pi b^2)^{-1/2} \int_{-\infty}^y e^{-(s-a)^2/2b^2} \, ds. \]

This distribution is called the normal (or Gaussian) distribution with parameters
$a, b^2$ (or briefly, the $N(a, b^2)$ distribution). By Proposition I.3.5, its ch.f. is

\[ f_Y(u) = e^{iua} e^{-(bu)^2/2}. \]

Since $E(|Y|^n) < \infty$ for all $n$, Corollary I.3.8 applies. In particular, one calculates
that

\[ m_1 = f_Y'(0)/i = a; \qquad m_2 = -f_Y''(0) = a^2 + b^2, \]

so that

\[ \sigma^2(Y) = m_2 - m_1^2 = b^2. \]

Thus, the parameters of the $N(a, b^2)$ distribution are its expectation and its
variance.
For the $N(0, 1)$ distribution, we write the power series for $f(u) = e^{-u^2/2}$
and deduce from Corollary I.3.8 that $m_{2j+1} = 0$ and $m_{2j} = (2j)!/(2^j j!)$, $j =
0, 1, 2, \dots$
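
The moment formula for the $N(0,1)$ distribution is easy to tabulate. The sketch below is illustrative (the function name `normal_moment` is an assumption); it simply evaluates $(2j)!/(2^j j!)$ for even orders and returns 0 for odd orders.

```python
from math import factorial

def normal_moment(n):
    """n-th moment of the N(0,1) distribution, read off from the power
    series of e^{-u^2/2}: zero for odd n, (2j)!/(2^j j!) for n = 2j."""
    if n % 2 == 1:
        return 0
    j = n // 2
    return factorial(2 * j) // (2**j * factorial(j))

# Moments of orders 0 through 8.
moments = [normal_moment(n) for n in range(9)]
```

The even moments 1, 1, 3, 15, 105 are the products of the odd integers up to $2j - 1$.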


Proposition. The sum of independent normally distributed r.v.’s is normally
distributed.


Proof. Let $X_k$ be $N(a_k, b_k^2)$ distributed independent r.v.’s ($k = 1,\dots,n$), and
let $X = X_1 + \cdots + X_n$. By Corollary I.2.4,

\[ f_X(u) = \prod_k f_{X_k}(u) = \prod_k e^{iua_k} e^{-b_k^2 u^2/2} = e^{iua} e^{-b^2 u^2/2} \]

with $a = \sum_k a_k$ and $b^2 = \sum_k b_k^2$. By the uniqueness theorem for ch.f.’s (see
Theorem I.4.2), $X$ is $N(a, b^2)$ distributed.


I.3.13. The Laplace density.

\[ F'(x) = (1/2b) e^{-|x-a|/b}, \]

where $a \in \mathbb{R}$ and $b > 0$ are its “parameters”. One calculates that

\[ f(u) = \frac{e^{iua}}{1 + b^2u^2}. \]

In particular, $F(\infty) = f(0) = 1$, so that $F$ is indeed a distribution function. One
verifies that

\[ m_1 = f'(0)/i = a; \qquad m_2 = f''(0)/i^2 = a^2 + 2b^2, \]

so that $\sigma^2 = 2b^2$.


I.3.14. The Cauchy density.

\[ F'(x) = (1/\pi) \frac{b}{b^2 + (x - a)^2}, \]

where $a \in \mathbb{R}$ and $b > 0$ are its “parameters”. To calculate its ch.f., one uses the
residues theorem with positively oriented rectangles in $\Im z \ge 0$ and in $\Im z \le 0$
for $u \ge 0$ and $u \le 0$, respectively, for the function $e^{iuz}/(b^2 + (z - a)^2)$. One gets

\[ f(u) = e^{iua} e^{-b|u|}. \tag{2} \]

In particular, $f(0) = 1$, so that $F(\infty) = 1$ as needed.

Note that $f$ is not differentiable at 0. Also $m_k$ do not exist for $k \ge 1$.

As in the case of normal r.v.’s, it follows from (2) that the sum of independent
Cauchy-distributed r.v.’s is Cauchy distributed, with parameters equal to the
sum of the corresponding parameters. This property is not true however for
Laplace-distributed r.v.’s.


I.3.15. The Gamma density.

\[ F'(x) = \frac{b^p}{\Gamma(p)} x^{p-1} e^{-bx} \qquad (x > 0), \]

and $F'(x) = 0$ for $x \le 0$. The “parameters” $b, p$ are positive. The special case
$p = 1$ gives the exponential distribution density.
The function $F$ is trivially non-decreasing, continuous, $F(-\infty) = 0$, and

\[ F(\infty) = \Gamma(p)^{-1} \int_0^{\infty} e^{-bx} (bx)^{p-1} \, d(bx) = 1, \]

so that $F$ is indeed a distribution function. We have

\[ f(u) = \frac{b^p}{\Gamma(p)} \int_0^{\infty} x^{p-1} e^{-(b-iu)x} \, dx = \frac{b^p}{\Gamma(p)(b - iu)^p} \int_0^{\infty} [(b - iu)x]^{p-1} e^{-(b-iu)x} \, d((b - iu)x). \]

By Cauchy’s integral theorem, integration along the ray $\{(b - iu)x; x \ge 0\}$ can
be replaced by integration along the ray $[0, \infty)$, and therefore the integral equals
$\Gamma(p)$, and

\[ f(u) = (1 - iu/b)^{-p}. \tag{3} \]

Thus

\[ f^{(k)}(u) = p(p + 1)\cdots(p + k - 1)(i/b)^k (1 - iu/b)^{-(p+k)} \]

and

\[ m_k = f^{(k)}(0)/i^k = p(p + 1)\cdots(p + k - 1) b^{-k}. \]

In particular,

\[ m_1 = p/b, \qquad m_2 = p(p + 1)/b^2, \qquad \sigma^2 = m_2 - m_1^2 = p/b^2. \]
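
The moment formula $m_k = p(p+1)\cdots(p+k-1)b^{-k}$ lends itself to a quick exact check of the mean and variance. The Python sketch below is an illustration; the function name `gamma_moment` and the choice of parameters ($p = n/2$, $b = 1/2$ for small integers $n$) are assumptions made for the example.

```python
from fractions import Fraction

def gamma_moment(k, p, b):
    """k-th moment p(p+1)...(p+k-1) / b^k of the Gamma distribution
    with parameters b, p."""
    m = Fraction(1)
    for i in range(k):
        m *= p + i
    return m / b**k

def mean_and_variance(p, b):
    m1 = gamma_moment(1, p, b)
    m2 = gamma_moment(2, p, b)
    return m1, m2 - m1**2

# Parameters p = n/2, b = 1/2 for n = 1..5: mean n and variance 2n.
table = {n: mean_and_variance(Fraction(n, 2), Fraction(1, 2)) for n in range(1, 6)}
```

In general the output agrees with $m_1 = p/b$ and $\sigma^2 = p/b^2$.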


As in the case of the normal distribution, (3) implies the following.


Proposition 1. The sum of independent Gamma-distributed r.v.’s with
parameters $b, p_k$, $k = 1,\dots,n$, is Gamma-distributed with parameters $b, \sum_k p_k$.


Note the special case of the exponential distribution ($p = 1$):

\[ f(u) = (1 - iu/b)^{-1}; \qquad m_1 = 1/b; \qquad \sigma^2 = 1/b^2. \]

Another important special case has $p = b = 1/2$. The Gamma distribution with
these parameters is called the standard $\chi^2$ distribution with one degree of freedom.
Its density equals 0 for $x \le 0$, and since $\Gamma(1/2) = \pi^{1/2}$,

\[ F'(x) = (2\pi)^{-1/2} x^{-1/2} e^{-x/2} \]

for $x > 0$.

By Proposition 1, the sum of $n$ independent random variables with the
standard $\chi^2$-distribution has the standard $\chi^2$-distribution with $n$ degrees of
freedom, that is, the Gamma distribution with $p = n/2$ and $b = 1/2$. Its density
for $x > 0$ is

\[ F'(x) = \frac{1}{2^{n/2} \Gamma(n/2)} x^{n/2-1} e^{-x/2}. \]

Note that $m_1 = p/b = n$ and $\sigma^2 = p/b^2 = 2n$.

The $\chi^2$ distribution arises naturally as follows. Suppose $X$ is a real r.v. with
continuous distribution $F_X$, and let $Y = X^2$. Then $F_Y(x) := P[Y < x] = 0$ for
$x \le 0$, and for $x > 0$,

\[ F_Y(x) = P[-x^{1/2} < X < x^{1/2}] = F_X(x^{1/2}) - F_X(-x^{1/2}). \]

If $F_X'$ exists and is continuous on $\mathbb{R}$, then

\[ F_Y'(x) = \frac{1}{2x^{1/2}} [F_X'(x^{1/2}) + F_X'(-x^{1/2})] \]

for $x > 0$ (and trivially 0 for $x \le 0$). In particular, if $X$ is $N(0, 1)$-distributed,
then $Y = X^2$ has the density $(2\pi)^{-1/2} x^{-1/2} e^{-x/2}$ for $x > 0$ (and 0 for
$x \le 0$), which is precisely the standard $\chi^2$ density for one degree of freedom.
Consequently, we have the following.


Proposition 2. Let X,,...,X, be N(0,1)-distributed independent r.v.’s and let 


Then F,2 is the standard x? distribution with n degrees of freedom (denoted 
PFys 7). 
XA snr 


If we start with $N(\mu,\sigma^2)$ independent r.v.’s $X_k$, the standardized r.v.’s

\[ Z_k := \frac{X_k - \mu}{\sigma} \]

are independent $N(0,1)$ variables, and therefore the sum $V := \sum_{k=1}^n Z_k^2$ has the $F_{\chi^2,n}$ distribution. Hence, if we let

\[ \chi^2_\sigma := \sum_{k=1}^n (X_k - \mu)^2, \]

then, for $x > 0$,

\[ F_{\chi^2_\sigma}(x) = P[V < x/\sigma^2] = F_{\chi^2,n}(x/\sigma^2). \]

This is clearly the Gamma distribution with parameters $p = n/2$ and $b = 1/(2\sigma^2)$. In particular, we have then $m_1 = p/b = n\sigma^2$ (not surprisingly!), and $\sigma^2(\chi^2_\sigma) = p/b^2 = 2n\sigma^4$.
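The moment formulas $m_1 = n\sigma^2$ and $\sigma^2(\chi^2_\sigma) = 2n\sigma^4$ can be checked empirically. The following Python sketch is only an illustration: the function names, the number of trials, and the seed are our own assumptions, not part of the text.

```python
import random

def chi2_sigma_draw(n, mu, sigma, rng):
    """One draw of sum_{k=1}^n (X_k - mu)^2 with X_k independent N(mu, sigma^2)."""
    return sum((rng.gauss(mu, sigma) - mu) ** 2 for _ in range(n))

def empirical_moments(n, mu, sigma, trials=20000, seed=0):
    """Empirical mean and variance of chi2_sigma over `trials` simulated samples."""
    rng = random.Random(seed)  # fixed seed: an arbitrary choice, for reproducibility
    draws = [chi2_sigma_draw(n, mu, sigma, rng) for _ in range(trials)]
    mean = sum(draws) / trials
    var = sum((d - mean) ** 2 for d in draws) / trials
    return mean, var

n, mu, sigma = 5, 1.0, 2.0
mean, var = empirical_moments(n, mu, sigma)
print(mean, n * sigma**2)       # empirical mean vs. theoretical n*sigma^2
print(var, 2 * n * sigma**4)    # empirical variance vs. theoretical 2*n*sigma^4
```

The agreement is only statistical; increasing `trials` tightens it.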


I.4 Characteristic functions

Let $F$ be a distribution function on $\mathbb{R}$. Its normalization is the distribution function

\[ F^*(x) := (1/2)[F(x - 0) + F(x + 0)] = (1/2)[F(x) + F(x + 0)] \]

(since $F$ is left-continuous). Of course, $F^*(x) = F(x)$ at all continuity points $x$ of $F$.


Theorem I.4.1 (The inversion theorem). Let $f$ be the ch.f. of the distribution function $F$. Then

\[ F^*(b) - F^*(a) = \lim_{U\to\infty} \frac{1}{2\pi} \int_{-U}^{U} \frac{e^{-iua} - e^{-iub}}{iu}\, f(u)\,du \]

for $-\infty < a < b < \infty$.


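Proof. Let $J_U$ denote the integral on the right-hand side. By Fubini’s theorem,

\[ J_U = \frac{1}{2\pi} \int_{-U}^{U} \frac{e^{-iua} - e^{-iub}}{iu} \int_{-\infty}^{\infty} e^{iux}\,dF(x)\,du = \int_{-\infty}^{\infty} K_U(x)\,dF(x), \]

where

\[ K_U(x) := \frac{1}{2\pi} \int_{-U}^{U} \frac{e^{iu(x-a)} - e^{iu(x-b)}}{iu}\,du = \frac{1}{\pi} \int_{U(x-b)}^{U(x-a)} \frac{\sin t}{t}\,dt. \]

The convergence of the Dirichlet integral $\int_{-\infty}^{\infty} (\sin t/t)\,dt$ (to the value $\pi$) implies that $|K_U(x)| \le M < \infty$ for all $x \in \mathbb{R}$ and $U > 0$, and as $U \to \infty$,

\[ K_U(x) \to \phi(x) := I_{(a,b)}(x) + (1/2)\left[ I_{\{a\}} + I_{\{b\}} \right](x) \]

pointwise. Therefore, by dominated convergence,

\[ \lim_{U\to\infty} J_U = \int_{\mathbb{R}} \phi\,dF = F^*(b) - F^*(a). \]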
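As a numerical illustration of the inversion formula, one may take the standard normal ch.f. $e^{-u^2/2}$ and compare the truncated integral with $\Phi(b) - \Phi(a)$. The sketch below is a minimal example under our own assumptions (a real, even ch.f.; a midpoint rule; the cut-off $U = 40$); the function names are hypothetical.

```python
import math

def cf_std_normal(u):
    """Characteristic function of N(0,1)."""
    return math.exp(-u * u / 2.0)

def inversion_integral(cf, a, b, U=40.0, steps=20000):
    """Midpoint-rule value of (1/2pi) * int_{-U}^{U} (e^{-iua} - e^{-iub})/(iu) * cf(u) du.

    Assumption: cf is real and even, so only the real part
    (sin(ub) - sin(ua))/u of the kernel contributes.
    With an even number of steps no midpoint falls on u = 0.
    """
    h = 2.0 * U / steps
    total = 0.0
    for j in range(steps):
        u = -U + (j + 0.5) * h
        total += (math.sin(u * b) - math.sin(u * a)) / u * cf(u)
    return total * h / (2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

a, b = -1.0, 2.0
approx = inversion_integral(cf_std_normal, a, b)
exact = Phi(b) - Phi(a)
print(approx, exact)
```

Since the normal distribution function is continuous, $F^* = F$ here, so the two printed numbers should agree up to the quadrature error.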


Theorem I.4.2 (The uniqueness theorem). A distribution function is uniquely determined by its ch.f.

Proof. Let $F, G$ be distribution functions with ch.f.’s $f, g$, and suppose that $f = g$. By the inversion theorem,

\[ F^*(b) - F^*(a) = G^*(b) - G^*(a) \]

for all real $a < b$. Letting $a \to -\infty$, we get $F^* = G^*$, and therefore $F = G$ at all points where $F, G$ are both continuous. Since these points are dense on $\mathbb{R}$ and $F, G$ are left-continuous, it follows that $F = G$.


Definition I.4.3. Let $C_F$ denote the set of all continuity points of the quasi-distribution function $F$ (its complement in $\mathbb{R}$ is finite or countable). A sequence $\{F_n\}$ of quasi-distribution functions converges weakly to $F$ if $F_n(x) \to F(x)$ for all $x \in C_F$. One writes then $F_n \to_w F$. In case $F_n, F$ are distributions of r.v.’s $X_n, X$ respectively, we also write $X_n \to_w X$.

Lemma I.4.4 (Helly–Bray). Let $F_n, F$ be quasi-distribution functions, $F_n \to_w F$, and suppose $a < b$ are such that $F_n(a) \to F(a)$ and $F_n(b) \to F(b)$. Then

\[ \int_a^b g\,dF_n \to \int_a^b g\,dF \]

for any continuous function $g$ on $[a,b]$.


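Proof. Consider partitions

\[ a = x_{m,1} < \cdots < x_{m,k_m+1} = b, \quad x_{m,j} \in C_F, \]

such that

\[ \delta_m := \sup_j (x_{m,j+1} - x_{m,j}) \to 0 \]

as $m \to \infty$. Let

\[ g_m := \sum_{j=1}^{k_m} g(x_{m,j})\, I_{[x_{m,j},\, x_{m,j+1})}. \]

Then

\[ \sup_{[a,b]} |g - g_m| \to_{m\to\infty} 0. \qquad (1) \]

The hypothesis implies that

\[ F_n(x_{m,j+1}) - F_n(x_{m,j}) \to F(x_{m,j+1}) - F(x_{m,j}) \]

when $n \to \infty$, for all $j = 1,\dots,k_m$; $m = 1,2,\dots$ Therefore

\[ \int_a^b g_m\,dF_n = \sum_{j=1}^{k_m} g(x_{m,j}) [F_n(x_{m,j+1}) - F_n(x_{m,j})] \to_{n\to\infty} \sum_{j=1}^{k_m} g(x_{m,j}) [F(x_{m,j+1}) - F(x_{m,j})] = \int_a^b g_m\,dF \quad (m = 1,2,\dots). \qquad (2) \]

Write

\[ \left| \int_a^b g\,dF_n - \int_a^b g\,dF \right| \le \int_a^b |g - g_m|\,dF_n + \left| \int_a^b g_m\,dF_n - \int_a^b g_m\,dF \right| + \int_a^b |g_m - g|\,dF \]

\[ \le 2 \sup_{[a,b]} |g - g_m| + \left| \int_a^b g_m\,dF_n - \int_a^b g_m\,dF \right|. \]

If $\epsilon > 0$, we may fix $m$ such that $\sup_{[a,b]} |g - g_m| < \epsilon/4$ (by (1)); for this $m$, it follows from (2) that there exists $n_0$ such that the second summand shown is $< \epsilon/2$ for all $n > n_0$. Hence

\[ \left| \int_a^b g\,dF_n - \int_a^b g\,dF \right| < \epsilon \quad (n > n_0). \]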


We consider next integration over R. 


Theorem I.4.5 (Helly–Bray). Let $F_n, F$ be quasi-distribution functions such that $F_n \to_w F$. Then for every $g \in C_0(\mathbb{R})$ (the continuous functions vanishing at $\infty$),

\[ \int_{\mathbb{R}} g\,dF_n \to \int_{\mathbb{R}} g\,dF. \]

In case $F_n, F$ are distribution functions, the conclusion is valid for all $g \in C_b(\mathbb{R})$ (the bounded continuous functions on $\mathbb{R}$).


Proof. Let $\epsilon > 0$. For $a < b$ in $C_F$ and $g \in C_b(\mathbb{R})$, write

\[ \left| \int_{\mathbb{R}} g\,dF_n - \int_{\mathbb{R}} g\,dF \right| \le \int_{[a,b]^c} |g|\,d(F_n + F) + \left| \int_a^b g\,dF_n - \int_a^b g\,dF \right|. \]

In case of quasi-distribution functions and $g \in C_0$, we may choose $[a,b]$ such that $|g| < \epsilon/4$ on $[a,b]^c$; the first term on the right-hand side is then $< \epsilon/2$ for all $n$. The second term is $< \epsilon/2$ for all $n > n_0$, by Lemma I.4.4, and the conclusion follows.

In the case of distribution functions and $g \in C_b$, let $M = \sup_{\mathbb{R}} |g|$. Then the first term on the right-hand side is

\[ \le M[F_n(a) + 1 - F_n(b) + F(a) + 1 - F(b)]. \]

Letting $n \to \infty$, we have by Lemma I.4.4

\[ \limsup_n \left| \int_{\mathbb{R}} g\,dF_n - \int_{\mathbb{R}} g\,dF \right| \le 2M[F(a) + 1 - F(b)], \]

for any $a < b$ in $C_F$. The right-hand side is arbitrarily small, since $F(-\infty) = 0$ and $F(\infty) = 1$.


Corollary I.4.6. Let $F_n, F$ be distribution functions such that $F_n \to_w F$, and let $f_n, f$ be their respective ch.f.’s. Then $f_n \to f$ pointwise on $\mathbb{R}$.


In order to prove a converse to this corollary, we need the following: 


Lemma I.4.7 (Helly). Every sequence of quasi-distribution functions contains 
a subsequence converging weakly to a quasi-distribution function (“weak 
sequential compactness of the space of quasi-distribution functions”). 


Proof. Let $\{F_n\}$ be a sequence of quasi-distribution functions, and let $\mathbb{Q} = \{x_k\}$ be the sequence of all rational points on $\mathbb{R}$.

Since $\{F_n(x_1)\} \subset [0,1]$, the Bolzano–Weierstrass theorem asserts the existence of a convergent subsequence $\{F_{n,1}(x_1)\}$. Again, $\{F_{n,1}(x_2)\} \subset [0,1]$, and has therefore a convergent subsequence $\{F_{n,2}(x_2)\}$, etc. Inductively, we obtain subsequences $\{F_{n,k}\}$ such that the $k$th subsequence is a subsequence of the $(k-1)$th subsequence, and converges at the points $x_1,\dots,x_k$. The diagonal subsequence $\{F_{n,n}\}$ converges therefore at all the rational points. Let $F_{\mathbb{Q}} := \lim_n F_{n,n}$, defined pointwise on $\mathbb{Q}$. For arbitrary $x \in \mathbb{R}$, define

\[ F(x) := \sup_{r \in \mathbb{Q},\, r < x} F_{\mathbb{Q}}(r). \]

Clearly, $F$ is non-decreasing, has range in $[0,1]$, and coincides with $F_{\mathbb{Q}}$ on $\mathbb{Q}$. Its left-continuity is verified as follows: given $\epsilon > 0$, there exists a rational $r < x$ (for $x \in \mathbb{R}$ given) such that $F_{\mathbb{Q}}(r) > F(x) - \epsilon$. If $t \in (r,x)$,

\[ F(x) \ge F(t) \ge F(r) = F_{\mathbb{Q}}(r) > F(x) - \epsilon, \]

so that $0 \le F(x) - F(t) < \epsilon$.

Thus $F$ is a quasi-distribution function.

Given $x \in C_F$, if $r, s \in \mathbb{Q}$ satisfy $r < x < s$, then

\[ F_{n,n}(r) \le F_{n,n}(x) \le F_{n,n}(s). \]

Therefore

\[ F(r) = F_{\mathbb{Q}}(r) \le \liminf F_{n,n}(x) \le \limsup F_{n,n}(x) \le F_{\mathbb{Q}}(s) = F(s). \]

Hence,

\[ F(x) = \sup_{r \in \mathbb{Q},\, r < x} F(r) \le \liminf F_{n,n}(x) \le \limsup F_{n,n}(x) \le F(s), \]

and since $x \in C_F$, letting $s \to x+$, we conclude that $F_{n,n}(x) \to F(x)$.


Theorem I.4.8 (Paul Lévy continuity theorem). Let $F_n$ be distribution functions such that their ch.f.’s $f_n$ converge pointwise to a function $g$ continuous at zero. Then there exists a distribution function $F$ (with ch.f. $f$) such that $F_n \to_w F$ and $f = g$.

Proof. Since $|f_n| \le 1$ (ch.f.’s!) and $f_n \to g$ pointwise, it follows by dominated convergence that

\[ \int_0^u f_n(t)\,dt \to \int_0^u g(t)\,dt \quad (u \in \mathbb{R}). \qquad (3) \]

By Lemma I.4.7, there exists a subsequence $\{F_{n_k}\}$ converging weakly to a quasi-distribution function $F$. Let $f$ be its ch.f. By Fubini’s theorem and Theorem I.4.5,

\[ \int_0^u f_{n_k}(t)\,dt = \int_{\mathbb{R}} \frac{e^{iux} - 1}{ix}\,dF_{n_k}(x) \to_{k\to\infty} \int_{\mathbb{R}} \frac{e^{iux} - 1}{ix}\,dF(x) = \int_0^u f(t)\,dt. \]

By (3), it follows that $\int_0^u g(t)\,dt = \int_0^u f(t)\,dt$ for all real $u$, and since both $g$ and $f$ are continuous at zero, it follows that $f(0) = g(0) := \lim f_n(0) = 1$, that is, $F(\infty) - F(-\infty) = 1$, hence necessarily $F(-\infty) = 0$ and $F(\infty) = 1$. Thus $F$ is a distribution function. By Corollary I.4.6, any distribution function that is the weak limit of some subsequence of $\{F_n\}$ has the ch.f. $g$, and therefore, by Theorem I.4.2, the full sequence $\{F_n\}$ converges weakly to $F$.


We proceed now to prove Lyapounov’s central limit theorem. 


Lemma I.4.9. Let $X$ be a real r.v. of class $L^r$ ($r \ge 2$), and let $f := f_X$. Then for any non-negative integer $n \le r - 1$,

\[ f(u) = \sum_{k=0}^n m_k (iu)^k/k! + R_n(u), \]

with

\[ |R_n(u)| \le E(|X|^{n+1}) |u|^{n+1}/(n+1)!, \]

for all $u \in \mathbb{R}$.

In particular, if $X$ is a central $L^3$-r.v., then

\[ f(u) = 1 - \sigma^2 u^2/2 + R_2(u), \]

where $\sigma^2 := \sigma^2(X)$ and

\[ |R_2(u)| \le E(|X|^3) |u|^3/3! \quad (u \in \mathbb{R}). \]


Proof. Apply Theorem I.3.7 and Taylor’s formula. 
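The bound on $R_2$ can be checked on a concrete case. The sketch below assumes, purely for illustration, that $X$ is uniformly distributed on $[-1,1]$, so that $f(u) = \sin u/u$, $\sigma^2 = 1/3$ and $E(|X|^3) = 1/4$; the names and the grid are our own choices.

```python
import math

def cf_uniform(u):
    """Ch.f. of the uniform distribution on [-1, 1]: sin(u)/u, with value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u

SIGMA2 = 1.0 / 3.0      # variance of the uniform distribution on [-1, 1]
ABS_M3 = 1.0 / 4.0      # E|X|^3 for the same distribution

def remainder(u):
    """R_2(u) = f(u) - (1 - sigma^2 u^2 / 2)."""
    return cf_uniform(u) - (1.0 - SIGMA2 * u * u / 2.0)

def bound(u):
    """The bound E(|X|^3) |u|^3 / 3! of the lemma."""
    return ABS_M3 * abs(u) ** 3 / 6.0

grid = [k / 10.0 for k in range(-200, 201)]
worst = max(abs(remainder(u)) - bound(u) for u in grid)
print(worst)  # non-positive (up to rounding) if the bound holds on the grid
```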


Consider next a sequence of independent central real r.v.’s $X_k$, $k = 1,2,\dots$, of class $L^3$. Denote $\sigma_k := \sigma(X_k)$. We assume that $\sigma_k \ne 0$ (i.e., $X_k$ is “non-degenerate”, which means that $X_k$ is not a.s. zero) for all $k$. We fix the following notation:

\[ f_k := f_{X_k}; \quad S_n := \sum_{k=1}^n X_k; \quad s_n := \sigma(S_n). \]

Of course, $s_n^2 = \sum_{k=1}^n \sigma_k^2$. In particular, $s_n \ne 0$ for all $n$, and we may consider the “standardized” sums $S_n/s_n$. We denote their ch.f.’s by $\phi_n$:

\[ \phi_n(u) = f_{S_n}(u/s_n) = \prod_{k=1}^n f_k(u/s_n). \qquad (4) \]

Finally, we let

\[ M_n := s_n^{-3} \sum_{k=1}^n E(|X_k|^3). \]


Lemma I.4.10. Let $\{X_k\}$ be as in the previous paragraph, and suppose that $M_n \to 0$ (“Lyapounov’s condition”). Then $\phi_n(u) \to e^{-u^2/2}$ pointwise everywhere on $\mathbb{R}$.


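Proof. By Lemma I.3.6, for all $k$,

\[ \sigma_k \le \|X_k\|_3. \qquad (5) \]

Therefore,

\[ \frac{\sigma_k}{s_n} \le \left[ \frac{E(|X_k|^3)}{s_n^3} \right]^{1/3} \le M_n^{1/3}, \quad k = 1,\dots,n. \qquad (6) \]

By (4) and Lemma I.4.9,

\[ \log \phi_n(u) = \sum_{k=1}^n \log\left[ 1 - \frac{\sigma_k^2}{s_n^2}\, u^2/2 + R_2^{[k]}(u/s_n) \right] \]

and

\[ |R_2^{[k]}(u/s_n)| \le \frac{E(|X_k|^3)}{s_n^3}\, |u|^3/3!. \qquad (7) \]

Write

\[ |\log \phi_n(u) - (-u^2/2)| = \left| \sum_{k=1}^n \log\left[ 1 - \frac{\sigma_k^2}{s_n^2}\, u^2/2 + R_2^{[k]}(u/s_n) \right] - \sum_{k=1}^n \left[ -\frac{\sigma_k^2}{s_n^2}\, u^2/2 + R_2^{[k]}(u/s_n) \right] + \sum_{k=1}^n R_2^{[k]}(u/s_n) \right| \]

\[ \le \sum_{k=1}^n |\log(1 + z_{k,n}) - z_{k,n}| + \sum_{k=1}^n |R_2^{[k]}(u/s_n)|, \qquad (8) \]

where

\[ z_{k,n} := -\frac{\sigma_k^2}{s_n^2}\, u^2/2 + R_2^{[k]}(u/s_n). \]

By (6) and (7),

\[ |z_{k,n}| \le M_n^{2/3} u^2/2 + M_n |u|^3/3! \quad (k \le n). \qquad (9) \]

Since $M_n \to 0$ by hypothesis, there exists $n_0$ such that the right-hand side of (9) is $< 1/2$ for all $n > n_0$. Thus

\[ |z_{k,n}| < 1/2 \quad (k = 1,\dots,n;\ n > n_0). \qquad (10) \]

By Taylor’s formula, for $|z| < 1$,

\[ |\log(1 + z) - z| \le \frac{|z|^2/2}{(1 - |z|)^2}. \]

Hence, by (7) and (10),

\[ |\log(1 + z_{k,n}) - z_{k,n}| \le 2|z_{k,n}|^2 \le \left( \frac{\sigma_k}{s_n} \right)^4 u^4/2 + \left( \frac{\sigma_k}{s_n} \right)^2 \frac{E(|X_k|^3)}{s_n^3}\, |u|^5/3 + \left( \frac{E(|X_k|^3)}{s_n^3} \right)^2 u^6/18 \]

for $k \le n$ and $n > n_0$. By (5) and (6), the first summand is

\[ \le M_n^{1/3}\, \frac{E(|X_k|^3)}{s_n^3}\, u^4/2. \]

The second summand is

\[ \le M_n^{2/3}\, \frac{E(|X_k|^3)}{s_n^3}\, |u|^5/3. \]

The third summand is

\[ \le M_n\, \frac{E(|X_k|^3)}{s_n^3}\, u^6/18. \]

Therefore (by (7) and (8)),

\[ |\log \phi_n(u) - (-u^2/2)| \le \left[ M_n^{1/3} u^4/2 + M_n^{2/3} |u|^5/3 + M_n u^6/18 + |u|^3/6 \right] \sum_{k=1}^n \frac{E(|X_k|^3)}{s_n^3} \]

\[ = M_n^{4/3} u^4/2 + M_n^{5/3} |u|^5/3 + M_n^2 u^6/18 + M_n |u|^3/6 \to 0 \]

as $n \to \infty$.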


Theorem I.4.11 (The Lyapounov central limit theorem). Let $\{X_k\}$ be a sequence of non-degenerate, real, central, independent, $L^3$-r.v.’s, such that

\[ \lim_{n\to\infty} s_n^{-3} \sum_{k=1}^n E(|X_k|^3) = 0. \]

Then the distribution function of

\[ \frac{\sum_{k=1}^n X_k}{s_n} = \frac{S_n}{\sigma(S_n)} \]

converges pointwise to the standard normal distribution as $n \to \infty$.

Proof. By Lemma I.4.10, the ch.f. of $S_n/s_n$ converges pointwise to the ch.f. $e^{-u^2/2}$ of the standard normal distribution (cf. Section I.3.12). By Theorem I.4.8 and Theorem I.4.2, the distribution function of $S_n/s_n$ converges pointwise to the standard normal distribution.


Corollary I.4.12 (Central limit theorem for uniformly bounded r.v.’s). Let $\{X_k\}$ be a sequence of non-degenerate, real, central, independent r.v.’s such that $|X_k| \le K$ for all $k \in \mathbb{N}$ and $s_n (:= \sigma(S_n)) \to \infty$. Then the distribution functions of $S_n/s_n$ converge pointwise to the standard normal distribution.

Proof. We have $E(|X_k|^3) \le K\sigma_k^2$. Therefore

\[ s_n^{-3} \sum_{k=1}^n E(|X_k|^3) \le K/s_n \to 0, \]

and Theorem I.4.11 applies.


Corollary I.4.13 (Laplace central limit theorem). Let $\{A_k\}$ be a sequence of independent events with $PA_k = p$, $0 < p < 1$, $k = 1,2,\dots$. Let $B_n$ be the (Bernoulli) r.v. whose value is the number of occurrences of the first $n$ events (“number of successes”). Then the distribution function of the “standardized Bernoulli r.v.”

\[ B_n^* := \frac{B_n - np}{(npq)^{1/2}} \quad (q := 1 - p) \]

converges pointwise to the standard normal distribution.

Proof. Let $X_k = I_{A_k} - p$. Then $X_k$ are non-degenerate (since $\sigma^2(X_k) = pq > 0$ when $0 < p < 1$), real, central, independent r.v.’s, and $|X_k| < 1$. Also $s_n = (npq)^{1/2} \to \infty$ (since $pq > 0$). By Corollary I.4.12, the distribution function of $S_n/s_n = B_n^*$ converges pointwise to the standard normal distribution.
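The Laplace theorem can be observed numerically by comparing the exact distribution function of the standardized binomial variable with the standard normal one. The following sketch is an illustration only; the function names, the choice $p = 0.3$ and the evaluation points are assumptions of ours.

```python
import math

def binom_pmf(n, p, k):
    """P[B_n = k], computed in log-space to avoid overflow for large n."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)

def standardized_cdf(n, p, t):
    """P[B_n^* < t], where B_n^* = (B_n - np)/sqrt(npq)."""
    x = n * p + t * math.sqrt(n * p * (1.0 - p))
    return sum(binom_pmf(n, p, k) for k in range(n + 1) if k < x)

def Phi(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

p = 0.3
points = (-2.0, -1.0, 0.0, 1.0, 2.0)
for n in (20, 200, 2000):
    err = max(abs(standardized_cdf(n, p, t) - Phi(t)) for t in points)
    print(n, err)   # maximal discrepancy over the chosen points, for each n
```

Because $B_n$ is discrete, the discrepancy at a fixed point does not decrease monotonically in $n$, but it is expected to be of the order $n^{-1/2}$.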


Corollary I.4.14 (Central limit theorem for equidistributed r.v.’s). Let $\{X_k\}$ be a sequence of non-degenerate, real, central, independent, equidistributed, $L^3$-r.v.’s. Then the distribution function of $S_n/s_n$ converges pointwise to the standard normal distribution.

Proof. Denote (independently of $k$, since the r.v.’s are equidistributed):

\[ E(|X_k|^3) = \alpha; \quad \sigma^2(X_k) = \sigma^2 (> 0). \]

Since $X_k$ are independent, $s_n^2 = n\sigma^2$ by Bienaymé’s identity, and therefore, as $n \to \infty$,

\[ M_n = \frac{n\alpha}{(n\sigma^2)^{3/2}} = (\alpha/\sigma^3)\, n^{-1/2} \to 0. \]

The result follows now from Theorem I.4.11.


I.5 Vector-valued random variables 


Let $X = (X_1,\dots,X_n)$ be an $\mathbb{R}^n$-valued r.v. on the probability space $(\Omega,\mathcal{A},P)$. We say that $X$ has a density if there exists a non-negative Borel function $h$ on $\mathbb{R}^n$ (called a density of $X$, or a joint density of $X_1,\dots,X_n$), such that

\[ P[X \in B] = \int_B h\,dx \quad (B \in \mathcal{B}(\mathbb{R}^n)), \]

where $dx = dx_1 \dots dx_n$ is Lebesgue measure on $\mathbb{R}^n$.

When $X$ has a density $h$, the latter is uniquely determined on $\mathbb{R}^n$ almost everywhere with respect to Lebesgue measure $dx$ (we may then refer to the density of $X$).

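Suppose the density of $X$ is of the form

\[ h(x) = u(x_1,\dots,x_k)\, v(x_{k+1},\dots,x_n), \qquad (*) \]

for some $1 \le k < n$, where $u, v$ are the densities of $(X_1,\dots,X_k)$ and $(X_{k+1},\dots,X_n)$, respectively. Then for any $A \in \mathcal{B}(\mathbb{R}^k)$ and $B \in \mathcal{B}(\mathbb{R}^{n-k})$, we have by Fubini’s theorem

\[ P[(X_1,\dots,X_k) \in A] \cdot P[(X_{k+1},\dots,X_n) \in B] = \int_A u\,dx_1 \dots dx_k \cdot \int_B v\,dx_{k+1} \dots dx_n \]

\[ = \int_{A \times B} h(x_1,\dots,x_n)\,dx_1 \dots dx_n = P[(X_1,\dots,X_n) \in A \times B] \]

\[ = P([(X_1,\dots,X_k) \in A] \cap [(X_{k+1},\dots,X_n) \in B]). \]

Thus $(X_1,\dots,X_k)$ and $(X_{k+1},\dots,X_n)$ are independent. Conversely, if $(X_1,\dots,X_k)$ and $(X_{k+1},\dots,X_n)$ are independent with respective densities $u, v$, then for all “measurable rectangles” $A \times B$ with $A, B$ Borel sets in $\mathbb{R}^k$ and $\mathbb{R}^{n-k}$, respectively,

\[ P[(X_1,\dots,X_n) \in A \times B] = P([(X_1,\dots,X_k) \in A] \cap [(X_{k+1},\dots,X_n) \in B]) \]

\[ = P[(X_1,\dots,X_k) \in A] \cdot P[(X_{k+1},\dots,X_n) \in B] \]

\[ = \int_A u\,dx_1 \dots dx_k \cdot \int_B v\,dx_{k+1} \dots dx_n = \int_{A \times B} uv\,dx_1 \dots dx_n, \]

and therefore

\[ P[X \in H] = \int_H uv\,dx \]

for all $H \in \mathcal{B}(\mathbb{R}^n)$, that is, $X$ has a density of the form (*).

We proved the following.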


Proposition I.5.1. Let $X$ be an $\mathbb{R}^n$-valued r.v., and suppose that for some $k \in \{1,\dots,n-1\}$, $X$ has a density of the form $h = uv$, where $u = u(x_1,\dots,x_k)$ and $v = v(x_{k+1},\dots,x_n)$ are densities for the r.v.’s $(X_1,\dots,X_k)$ and $(X_{k+1},\dots,X_n)$, respectively. Then $(X_1,\dots,X_k)$ and $(X_{k+1},\dots,X_n)$ are independent. Conversely, if, for some $k$ as indicated, $(X_1,\dots,X_k)$ and $(X_{k+1},\dots,X_n)$ are independent with densities $u, v$, respectively, then $h := uv$ is a density for $(X_1,\dots,X_n)$.

If the $\mathbb{R}^n$-r.v. $X$ has the density $h$ and $x = x(y)$ is a $C^1$-transformation of $\mathbb{R}^n$ with inverse $y = y(x)$ and Jacobian $J \ne 0$, we have

\[ P[X \in B] = \int_{y(B)} h(x(y))\, |J(y)|\,dy \]

for all $B \in \mathcal{B}(\mathbb{R}^n)$.


Example I.5.2. The distribution of a sum.

Suppose $(X,Y)$ has the density $h : \mathbb{R}^2 \to [0,\infty)$. Consider the transformation

\[ x = u - v; \quad y = v \]

with Jacobian identically 1 and inverse

\[ u = x + y; \quad v = y. \]

The set

\[ B = \{(x,y) \in \mathbb{R}^2;\ x + y < c,\ y \in \mathbb{R}\} \]

corresponds to the set

\[ B' = \{(u,v) \in \mathbb{R}^2;\ u < c,\ v \in \mathbb{R}\}. \]

Therefore, by Tonelli’s theorem,

\[ F_{X+Y}(c) := P[X + Y < c] = P[(X,Y) \in B] = \int_{B'} h(u - v, v)\,du\,dv = \int_{-\infty}^{c} \left( \int_{-\infty}^{\infty} h(u - v, v)\,dv \right) du. \]

This shows that the r.v. $X + Y$ has the density

\[ h_{X+Y}(u) = \int_{-\infty}^{\infty} h(u - v, v)\,dv. \]

In particular, when $X, Y$ are independent with respective densities $h_X$ and $h_Y$, then $X + Y$ has the density

\[ h_{X+Y}(u) = \int_{\mathbb{R}} h_X(u - v)\, h_Y(v)\,dv := (h_X * h_Y)(u) \]

(the convolution of $h_X$ and $h_Y$).
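The convolution formula lends itself to a direct numerical check. As a minimal sketch (our own example, not the text’s), take $X, Y$ independent and exponentially distributed with parameter 1; then $h_X * h_Y$ should be the Gamma density $u e^{-u}$ for $u > 0$. The function names and the integration window are assumptions.

```python
import math

def h_exp(x):
    """Density of the exponential distribution with parameter 1."""
    return math.exp(-x) if x > 0.0 else 0.0

def convolve(hx, hy, u, lo=-10.0, hi=10.0, steps=200000):
    """Midpoint-rule approximation of int hx(u - v) hy(v) dv over [lo, hi].

    The window [lo, hi] is an assumption: it must carry essentially all
    of the mass of the integrand.
    """
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        v = lo + (j + 0.5) * h
        total += hx(u - v) * hy(v)
    return total * h

u = 1.5
approx = convolve(h_exp, h_exp, u)
print(approx, u * math.exp(-u))  # numerical value vs. closed form u e^{-u}
```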
Example I.5.3. The distribution of a ratio.

Let $Y$ be a positive r.v., and $X$ any real r.v. We assume that $(X,Y)$ has a density $h$. The transformation

\[ x = uv; \quad y = v \]

has the inverse

\[ u = x/y; \quad v = y, \]

and Jacobian $J = v > 0$. Therefore,

\[ F_{X/Y}(c) = P[X/Y < c] = P[(X,Y) \in B], \]

where

\[ B = \{(x,y);\ -\infty < x < cy,\ y > 0\} \]

corresponds to

\[ B' = \{(u,v);\ -\infty < u < c,\ v > 0\}. \]

Therefore, by Tonelli’s theorem,

\[ F_{X/Y}(c) = \int_{B'} h(uv, v)\, v\,du\,dv = \int_{-\infty}^{c} \left( \int_0^{\infty} h(uv, v)\, v\,dv \right) du \]

for all real $c$. This shows that $X/Y$ has the density

\[ h_{X/Y}(u) = \int_0^{\infty} h(uv, v)\, v\,dv \quad (u \in \mathbb{R}). \]

When $X, Y$ are independent, this formula becomes

\[ h_{X/Y}(u) = \int_0^{\infty} h_X(uv)\, h_Y(v)\, v\,dv \quad (u \in \mathbb{R}). \]


Let $X$ be an $\mathbb{R}^n$-valued r.v., and let $g$ be a real Borel function on $\mathbb{R}^n$. The r.v. $g(X)$ is called a statistic. For example,

\[ \bar{X} := (1/n) \sum_{k=1}^n X_k \]

and

\[ S^2 := (1/n) \sum_{k=1}^n (X_k - \bar{X})^2 \]

are the statistics corresponding to the Borel functions

\[ \bar{x}(x_1,\dots,x_n) := (1/n) \sum_{k=1}^n x_k \]

and

\[ s^2(x_1,\dots,x_n) := (1/n) \sum_{k=1}^n [x_k - \bar{x}(x_1,\dots,x_n)]^2, \]

respectively.

The statistics $\bar{X}$ and $S^2$ are called the sample mean and the sample variance, respectively.


Theorem I.5.4 (Fisher). Let $X_1,\dots,X_n$ be independent $N(0,\sigma^2)$-distributed r.v.’s. Let

\[ Z_k := \frac{X_k - \bar{X}}{S}, \quad k = 1,\dots,n. \]

Then

(1) $(Z_1,\dots,Z_{n-2})$, $\bar{X}$, and $S$ are independent;
(2) $(Z_1,\dots,Z_{n-2})$ has density independent of $\sigma$;
(3) $\bar{X}$ is $N(0,\sigma^2/n)$-distributed; and
(4) $nS^2$ is $\chi^2_\sigma$-distributed, with $n - 1$ degrees of freedom.


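Proof. The map sending $X_1,\dots,X_n$ to the statistics $Z_1,\dots,Z_{n-2}$, $\bar{X}$, and $S^2$ is given by the following equations:

\[ z_k = \frac{x_k - \bar{x}}{s}, \quad k = 1,\dots,n-2; \qquad \bar{x} = (1/n) \sum_{k=1}^n x_k; \qquad s^2 = (1/n) \sum_{k=1}^n (x_k - \bar{x})^2. \qquad (1) \]

Note the relations (for $z_k$ defined as in (1) for all $k = 1,\dots,n$)

\[ \sum_{k=1}^n z_k = 0; \quad \sum_{k=1}^n z_k^2 = n; \qquad (2) \]

\[ \sum_{k=1}^n x_k^2 = n(\bar{x})^2 + ns^2. \qquad (3) \]

By (2),

\[ z_{n-1} + z_n = u \left( := -\sum_{k=1}^{n-2} z_k \right) \]

and

\[ z_{n-1}^2 + z_n^2 = w \left( := n - \sum_{k=1}^{n-2} z_k^2 \right). \]

Thus,

\[ (u - z_n)^2 + z_n^2 = w, \]

that is,

\[ z_n^2 - u z_n + (u^2 - w)/2 = 0. \]

Therefore,

\[ z_n = (u + v)/2; \quad z_{n-1} = (u - v)/2, \]

where $v := \sqrt{2w - u^2}$; a second solution has $v$ replaced by $-v$. Note that

\[ 2w - u^2 = 2z_{n-1}^2 + 2z_n^2 - (z_{n-1} + z_n)^2 = (z_{n-1} - z_n)^2 \ge 0, \]

so that $v$ is real.

The inverse transformations are then

\[ x_k = \bar{x} + s z_k, \quad k = 1,\dots,n-2; \]

\[ x_{n-1} = \bar{x} + s(u - v)/2; \]

\[ x_n = \bar{x} + s(u + v)/2, \]

with $v$ replaced by $-v$ in the second inverse; $u, v$ are themselves functions of $z_1,\dots,z_{n-2}$.

The corresponding Jacobian

\[ J := \frac{\partial(x_1,\dots,x_n)}{\partial(z_1,\dots,z_{n-2},\bar{x},s)} \]

has the form

\[ J = s^{n-2} g(z_1,\dots,z_{n-2}), \]

where $g$ is a function of $z_1,\dots,z_{n-2}$ only, and does not depend on the parameter $\sigma^2$ of the given normal distribution of the $X_k$ ($k = 1,\dots,n$).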


Replacing $v$ by $-v$ only interchanges the last two columns of the determinant, so that $|J|$ remains unchanged. Therefore, using (3),

\[ h_X(x)\,dx = (2\pi\sigma^2)^{-n/2} e^{-\sum_{k=1}^n x_k^2/2\sigma^2}\,dx_1 \dots dx_n \]

\[ = 2(2\pi\sigma^2)^{-n/2} e^{-(n\bar{x}^2 + ns^2)/2\sigma^2}\, s^{n-2}\, |g(z_1,\dots,z_{n-2})|\,dz_1 \dots dz_{n-2}\,d\bar{x}\,ds, \]

where the factor 2 comes from the fact that the inverse is bivalued, with the same value of $|J|$ for both possible choices, hence doubling the “mass element”. The last expression can be written in the form

\[ h_1(\bar{x})\,d\bar{x}\; h_2(s)\,ds\; h_3(z_1,\dots,z_{n-2})\,dz_1 \dots dz_{n-2}, \]

where

\[ h_1(\bar{x}) = \frac{1}{\sqrt{2\pi}\,\sigma/\sqrt{n}}\, e^{-(\bar{x})^2/2(\sigma/\sqrt{n})^2} \]

is the $N(0,\sigma^2/n)$-density;

\[ h_2(s) = \frac{n^{(n-1)/2}\, s^{n-2}}{2^{(n-3)/2}\, \Gamma((n-1)/2)\, \sigma^{n-1}}\, e^{-ns^2/2\sigma^2} \quad (s > 0) \]

(and $h_2(s) = 0$ for $s \le 0$) is seen to be a density (i.e., has integral $= 1$); and

\[ h_3(z_1,\dots,z_{n-2}) = \frac{\Gamma((n-1)/2)}{n^{n/2}\, \pi^{(n-1)/2}}\, |g(z_1,\dots,z_{n-2})| \]

is necessarily the density for $(Z_1,\dots,Z_{n-2})$, and clearly does not depend on $\sigma^2$.

The shown decomposition implies by Proposition I.5.1 that Statements 1–3 of the theorem are correct. Moreover, for $x > 0$, we have

\[ F_{nS^2}(x) = P[nS^2 < x] = P[S < \sqrt{x/n}] = \int_0^{\sqrt{x/n}} h_2(s)\,ds. \]

Therefore

\[ h_{nS^2}(x) := \frac{d}{dx} F_{nS^2}(x) = h_2(\sqrt{x/n})\,(1/2)(x/n)^{-1/2}(1/n) = \frac{x^{(n-1)/2-1}\, e^{-x/2\sigma^2}}{(2\sigma^2)^{(n-1)/2}\, \Gamma((n-1)/2)}. \]

This is precisely the $\chi^2_\sigma$ density with $n - 1$ degrees of freedom.

Theorem I.5.4 will be applied in the sequel to obtain the distributions of some important statistics.


Theorem I.5.5. Let $U, V$ be independent r.v.’s, with $U$ normal $N(0,1)$ and $V$ $\chi^2$-distributed with $\nu$ degrees of freedom. Then the statistic

\[ T := \frac{U}{\sqrt{V/\nu}} \]

has the density

\[ h_\nu(t) = \nu^{-1/2} B(1/2, \nu/2)^{-1} (1 + t^2/\nu)^{-(\nu+1)/2} \quad (t \in \mathbb{R}), \]

called the “t-density” or the “Student density with $\nu$ degrees of freedom”.

In the previous formula, $B(\cdot,\cdot)$ denotes the beta function:

\[ B(s,t) = \frac{\Gamma(s)\Gamma(t)}{\Gamma(s+t)} \quad (s,t > 0). \]

Proof. We apply Example I.5.3 with the independent r.v.’s $X = U$ and $Y = \sqrt{V/\nu}$. The distribution $F_Y$ (for $y > 0$) is given by

\[ F_Y(y) := P[\sqrt{V/\nu} < y] = P[V < \nu y^2] = [2^{\nu/2}\Gamma(\nu/2)]^{-1} \int_0^{\nu y^2} s^{(\nu/2)-1} e^{-s/2}\,ds \]

(cf. Section I.3.15). The corresponding density is (for $y > 0$)

\[ h_Y(y) = \frac{d}{dy} F_Y(y) = \frac{\nu^{\nu/2}}{2^{\nu/2-1}\Gamma(\nu/2)}\, y^{\nu-1} e^{-\nu y^2/2}, \]

and of course $h_Y(y) = 0$ for $y \le 0$.

By hypothesis, $h_X(x) = e^{-x^2/2}/\sqrt{2\pi}$. By Example I.5.3, the density of $T$ is

\[ h_T(t) = \int_0^{\infty} h_X(vt)\, h_Y(v)\, v\,dv = \frac{\nu^{\nu/2}}{\sqrt{2\pi}\, 2^{\nu/2-1}\Gamma(\nu/2)} \int_0^{\infty} e^{-v^2(t^2+\nu)/2}\, v^{\nu}\,dv. \]

Write $s = v^2(t^2 + \nu)/2$:

\[ h_T(t) = \frac{\nu^{\nu/2}}{\sqrt{\pi}\,\Gamma(\nu/2)\,(t^2+\nu)^{(\nu+1)/2}} \int_0^{\infty} e^{-s} s^{(\nu+1)/2-1}\,ds. \]

Since $\Gamma(1/2) = \sqrt{\pi}$, the last expression coincides with $h_\nu(t)$.
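The computation in the proof can be mirrored numerically: integrating the ratio formula of Example I.5.3 should reproduce the closed form $h_\nu(t)$. The sketch below is only a sanity check under our own assumptions (midpoint rule, truncation of the integral at `upper`, hypothetical function names).

```python
import math

def h_X(x):
    """N(0,1) density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def h_Y(y, nu):
    """Density of sqrt(V/nu), V chi-square with nu degrees of freedom."""
    if y <= 0.0:
        return 0.0
    log_c = ((nu / 2.0) * math.log(nu) - (nu / 2.0 - 1.0) * math.log(2.0)
             - math.lgamma(nu / 2.0))
    return math.exp(log_c + (nu - 1.0) * math.log(y) - nu * y * y / 2.0)

def h_T_numeric(t, nu, upper=12.0, steps=40000):
    """Midpoint rule for int_0^upper h_X(v t) h_Y(v) v dv; `upper` truncates the integral."""
    h = upper / steps
    total = 0.0
    for j in range(steps):
        v = (j + 0.5) * h
        total += h_X(v * t) * h_Y(v, nu) * v
    return total * h

def h_student(t, nu):
    """Closed form nu^{-1/2} B(1/2, nu/2)^{-1} (1 + t^2/nu)^{-(nu+1)/2}."""
    log_beta = math.lgamma(0.5) + math.lgamma(nu / 2.0) - math.lgamma((nu + 1.0) / 2.0)
    return math.exp(-0.5 * math.log(nu) - log_beta
                    - (nu + 1.0) / 2.0 * math.log(1.0 + t * t / nu))

t, nu = 1.3, 5
numeric, closed = h_T_numeric(t, nu), h_student(t, nu)
print(numeric, closed)
```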


Corollary I.5.6. Let $X_1,\dots,X_n$ be independent $N(\mu,\sigma^2)$-r.v.’s. Then the statistic

\[ T := \sqrt{n-1}\; \frac{\bar{X} - \mu}{S} \qquad (*) \]

has the Student distribution with $\nu = n - 1$ degrees of freedom.

Proof. Take $U = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ and $V = nS^2/\sigma^2$. By Fisher’s theorem, $U, V$ satisfy the hypothesis of Theorem I.5.5, and the conclusion follows (since the statistic $T$ in Theorem I.5.5 coincides in the present case with (*)).


Corollary I.5.7. Let $X_1,\dots,X_n$ and $Y_1,\dots,Y_m$ be independent $N(\mu,\sigma^2)$-r.v.’s. Let $\bar{X}$, $\bar{Y}$, $S_X^2$, and $S_Y^2$ be the “sample means” and “sample variances” (for the “samples” $X_1,\dots,X_n$ and $Y_1,\dots,Y_m$). Then the statistic

\[ W := \sqrt{\frac{(n+m-2)\,nm}{n+m}}\; \frac{\bar{X} - \bar{Y}}{\sqrt{nS_X^2 + mS_Y^2}} \]

has the Student distribution with $\nu = n + m - 2$ degrees of freedom.

Proof. The independence hypothesis implies that $\bar{X}$ and $\bar{Y}$ are independent normal r.v.’s with parameters $(\mu,\sigma^2/n)$ and $(\mu,\sigma^2/m)$, respectively. Therefore $\bar{X} - \bar{Y}$ is $N(0,\sigma^2(n+m)/nm)$-distributed, and

\[ U := \frac{\bar{X} - \bar{Y}}{\sigma\sqrt{(n+m)/nm}} \]

is $N(0,1)$-distributed.

By Fisher’s theorem, the r.v.’s $nS_X^2/\sigma^2$ and $mS_Y^2/\sigma^2$ are $\chi^2$-distributed with $n - 1$ and $m - 1$ degrees of freedom, respectively, and are independent (as Borel functions of the independent r.v.’s $X_1,\dots,X_n$ and $Y_1,\dots,Y_m$, resp.). Since the $\chi^2$-distribution with $r$ degrees of freedom is the Gamma distribution with $p = r/2$ and $b = 1/2$, it follows from Proposition 1 in Section I.3.15 that the r.v.

\[ V := \frac{nS_X^2 + mS_Y^2}{\sigma^2} \]

is $\chi^2$-distributed with $\nu = (n-1) + (m-1)$ degrees of freedom.

Also, by Fisher’s theorem, $U, V$ are independent. We may then apply Theorem I.5.5 to the present choice of $U, V$. An easy calculation shows that for this choice $T = W$, and the conclusion follows.


Remark I.5.8. The statistic $T$ is used in “testing hypotheses” about the value of the mean $\mu$ of a normal “population”, using the “sample outcomes” $X_1,\dots,X_n$. The statistic $W$ is used in testing the “zero-hypothesis” that two normal populations have the same mean (using the outcomes of samples taken from the respective populations). Its efficiency is enhanced by the fact that it is independent of the unknown parameters $(\mu,\sigma^2)$ of the normal population (cf. Section I.6).

Tables of the Student distribution are usually available for $\nu < 30$. For $\nu \ge 30$, the normal distribution is a good approximation. The following theorem supports this fact:


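Theorem I.5.9. Let $h_\nu$ be the Student distribution density with $\nu$ degrees of freedom. Then as $\nu \to \infty$, $h_\nu$ converges pointwise to the $N(0,1)$-density, and

\[ \lim_{\nu\to\infty} P[a \le T < b] = (1/\sqrt{2\pi}) \int_a^b e^{-t^2/2}\,dt \qquad (*) \]

for all real $a < b$.

Proof. By Stirling’s formula, $\Gamma(x)$ is asymptotically equal to $(2\pi/x)^{1/2}(x/e)^x$ as $x \to \infty$. Therefore (since $\Gamma(1/2) = \sqrt{\pi}$)

\[ \lim_{\nu\to\infty} \frac{\nu^{1/2} B(1/2,\nu/2)}{\sqrt{2\pi}} = \lim_{\nu\to\infty} \frac{\Gamma(\nu/2)}{(\nu/2e)^{\nu/2}} \cdot \frac{((\nu+1)/2e)^{(\nu+1)/2}}{\Gamma((\nu+1)/2)} \cdot \frac{e^{1/2}}{(1 + 1/\nu)^{(\nu+1)/2}} = 1, \]

because the first two factors are asymptotically equal to $(4\pi/\nu)^{1/2}$ and $((\nu+1)/4\pi)^{1/2}$, whose product tends to 1, and the third factor tends to 1.

Hence, as $\nu \to \infty$,

\[ h_\nu(t) = [\nu^{1/2} B(1/2,\nu/2)]^{-1} (1 + t^2/\nu)^{-(\nu+1)/2} \to (1/\sqrt{2\pi})\, e^{-t^2/2}. \]

For real $a < b$,

\[ P[a \le T < b] = [\nu^{1/2} B(1/2,\nu/2)]^{-1} \int_a^b \frac{dt}{(1 + t^2/\nu)^{(\nu+1)/2}}. \]

The coefficient before the integral was seen to converge to $1/\sqrt{2\pi}$; the integrand converges pointwise to $e^{-t^2/2}$ and is bounded by 1; therefore (*) follows by dominated convergence.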
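The pointwise convergence of $h_\nu$ to the normal density is easy to observe numerically. The following sketch evaluates $h_\nu(1)$ for a few values of $\nu$; the function names and the chosen values are ours, and the log-gamma function is used only to avoid overflow for large $\nu$.

```python
import math

def h_student(t, nu):
    """Student density nu^{-1/2} B(1/2, nu/2)^{-1} (1 + t^2/nu)^{-(nu+1)/2}."""
    log_beta = math.lgamma(0.5) + math.lgamma(nu / 2.0) - math.lgamma((nu + 1.0) / 2.0)
    return math.exp(-0.5 * math.log(nu) - log_beta
                    - (nu + 1.0) / 2.0 * math.log(1.0 + t * t / nu))

def phi(t):
    """N(0,1) density."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

t = 1.0
degrees = (1, 10, 100, 1000, 100000)
errors = [abs(h_student(t, nu) - phi(t)) for nu in degrees]
for nu, err in zip(degrees, errors):
    print(nu, err)   # discrepancy |h_nu(1) - phi(1)| for each nu
```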


Theorem I.5.10. Let $U_i$ be independent $\chi^2$-distributed r.v.’s with $\nu_i$ degrees of freedom ($i = 1,2$). Assume $U_2 > 0$, and consider the statistic

\[ F := \frac{U_1/\nu_1}{U_2/\nu_2}. \]

Then $F$ has the distribution density

\[ h(u;\nu_1,\nu_2) = \frac{\nu_1^{\nu_1/2}\, \nu_2^{\nu_2/2}}{B(\nu_1/2,\nu_2/2)} \cdot \frac{u^{\nu_1/2-1}}{(\nu_1 u + \nu_2)^{(\nu_1+\nu_2)/2}} \]

for $u > 0$ (and $= 0$ for $u \le 0$).

The discussed density is called the “F-density” or “Snedecor density” with $(\nu_1,\nu_2)$ degrees of freedom.

Proof. We take in Example I.5.3 the independent r.v.’s $X = U_1/\nu_1$ and $Y = U_2/\nu_2$. We have for $x > 0$:

\[ F_X(x) = P[U_1 < \nu_1 x] = [2^{\nu_1/2}\Gamma(\nu_1/2)]^{-1} \int_0^{\nu_1 x} t^{\nu_1/2-1} e^{-t/2}\,dt \]

and $F_X(x) = 0$ for $x \le 0$. Therefore, for $x > 0$,

\[ h_X(x) = \frac{d}{dx} F_X(x) = \frac{(\nu_1/2)^{\nu_1/2}}{\Gamma(\nu_1/2)}\, x^{\nu_1/2-1} e^{-\nu_1 x/2}, \]

and $h_X(x) = 0$ for $x \le 0$. A similar formula is valid for $Y$, with $\nu_2$ replacing $\nu_1$. By Example I.5.3, the density of $F := X/Y$ is 0 for $u \le 0$, and for $u > 0$,

\[ h_F(u) = \int_0^{\infty} h_X(uv)\, h_Y(v)\, v\,dv = \frac{(\nu_1/2)^{\nu_1/2} (\nu_2/2)^{\nu_2/2}}{\Gamma(\nu_1/2)\Gamma(\nu_2/2)} \int_0^{\infty} (uv)^{\nu_1/2-1}\, v^{\nu_2/2}\, e^{-(\nu_1 uv + \nu_2 v)/2}\,dv. \qquad (*) \]

The integral is

\[ u^{\nu_1/2-1} \int_0^{\infty} e^{-v(\nu_1 u + \nu_2)/2}\, v^{(\nu_1+\nu_2)/2-1}\,dv. \]

Making the substitution $v(\nu_1 u + \nu_2)/2 = s$, the integral takes the form

\[ \frac{u^{\nu_1/2-1}}{[(\nu_1 u + \nu_2)/2]^{(\nu_1+\nu_2)/2}} \int_0^{\infty} e^{-s} s^{(\nu_1+\nu_2)/2-1}\,ds. \]

Since the last integral is $\Gamma((\nu_1 + \nu_2)/2)$, it follows from (*) that $h_F(u) = h(u;\nu_1,\nu_2)$, for $h$ as in the theorem’s statement.


Corollary I.5.11. Let $X_1,\dots,X_n$ and $Y_1,\dots,Y_m$ be independent $N(0,\sigma^2)$-distributed r.v.’s. Then the statistic

\[ F := \frac{S_X^2}{S_Y^2} \cdot \frac{1 - 1/m}{1 - 1/n} \]

has Snedecor’s density with $\nu_1 = n - 1$ and $\nu_2 = m - 1$ degrees of freedom.

Proof. Let $U_1 := nS_X^2/\sigma^2$ and $U_2 := mS_Y^2/\sigma^2$. By Fisher’s theorem, $U_i$ are $\chi^2$-distributed with $\nu_i$ degrees of freedom ($i = 1,2$). Since they are independent, the r.v. $F = (U_1/\nu_1)/(U_2/\nu_2)$ has the Snedecor density with $(\nu_1,\nu_2)$ degrees of freedom, by Theorem I.5.10.

I.5.12. The statistic $F$ of Corollary I.5.11 is used, for example, to test the “zero hypothesis” that two normal “populations” have the same variance (see Section I.6). Statistical tables give $u_\alpha$, defined by

\[ P[F \ge u_\alpha] \left( = \int_{u_\alpha}^{\infty} h(u;\nu_1,\nu_2)\,du \right) = \alpha, \]

for various values of $\alpha$ and of the degrees of freedom $\nu_i$.


I.6 Estimation and decision

We consider random sampling of size $n$ from a given population. The $n$ outcomes are a value of an $\mathbb{R}^n$-valued r.v. $X = (X_1,\dots,X_n)$, where $X_k$ have the same distribution function (the “population distribution”) $F(\cdot\,;\theta)$; the “parameter vector” $\theta$ is usually unknown. For example, a normal population has the parameter vector $(\mu,\sigma^2)$, etc.

Estimation is concerned with the problem of “estimating” $\theta$ by using the sample outcomes $X_1,\dots,X_n$, say, by means of some Borel function of $X_1,\dots,X_n$:

\[ \theta^* := g(X_1,\dots,X_n). \]

This statistic is called an estimator of $\theta$.

Consider the case of a single real parameter $\theta$.

A measure of the estimator’s precision is its mean square deviation from $\theta$,

\[ E(\theta^* - \theta)^2, \]

called the risk function of the estimator.

We have

\[ E(\theta^* - \theta)^2 = E[(\theta^* - E\theta^*) + (E\theta^* - \theta)]^2 = E(\theta^* - E\theta^*)^2 + 2E(\theta^* - E\theta^*) \cdot (E\theta^* - \theta) + (E\theta^* - \theta)^2. \]

The middle term vanishes, so that the risk function is equal to

\[ \sigma^2(\theta^*) + (E\theta^* - \theta)^2. \qquad (*) \]

The difference $\theta - E\theta^*$ is called the bias of the estimator $\theta^*$; the estimator is unbiased if the bias is zero, that is, if $E\theta^* = \theta$. In this case the risk function is equal to the variance $\sigma^2(\theta^*)$.
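The decomposition of the risk function into variance plus squared bias can be verified exactly for a discrete model. In the sketch below the population is binomial and the estimator $(S_n + 1)/(n + 2)$ is a hypothetical biased estimator chosen by us only for illustration; all function names are assumptions.

```python
import math

def binom_pmf(n, p, k):
    """P[S_n = k] for a binomial (n, p) variable."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def risk_parts(n, p, estimator):
    """Exact risk, variance and bias of estimator(S_n), with S_n binomial (n, p)."""
    pmf = [binom_pmf(n, p, k) for k in range(n + 1)]
    values = [estimator(k) for k in range(n + 1)]
    mean = sum(w * v for w, v in zip(pmf, values))
    risk = sum(w * (v - p) ** 2 for w, v in zip(pmf, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(pmf, values))
    bias = p - mean          # the text's sign convention: theta - E(theta*)
    return risk, var, bias

n, p = 12, 0.3
# Hypothetical biased estimator (S_n + 1)/(n + 2), used only as an example.
risk, var, bias = risk_parts(n, p, lambda k: (k + 1) / (n + 2))
print(risk, var + bias**2)   # the two sides of the decomposition
```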


Example I.6.1. We wish to estimate the expectation $\mu$ of the population distribution. Any weighted average

\[ \mu^* = \sum_{k=1}^n a_k X_k \quad (a_k > 0,\ \sum a_k = 1) \]

is a reasonable unbiased estimator of $\mu$:

\[ E\mu^* = \sum a_k EX_k = \sum a_k \mu = \mu. \]

By Bienaymé’s identity (since the $X_k$ are independent in random sampling), the risk function is given by

\[ \sigma^2(\mu^*) = \sum_{k=1}^n a_k^2 \sigma^2(X_k) = \sigma^2 \sum_{k=1}^n a_k^2, \]

where $\sigma^2$ is the population variance (assuming that it exists). However, since $\sum a_k = 1$, we have

\[ \sum a_k^2 = \sum (a_k - 1/n)^2 + 1/n \ge 1/n, \]

and the minimum $1/n$ is attained when $a_k = 1/n$ for all $k$. Thus, among all estimators of $\mu$ that are weighted averages of the sample outcomes, the estimator with minimal risk function is the arithmetical mean $\mu^* = \bar{X}$; its risk function is $\sigma^2/n$.
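The identity $\sum a_k^2 = \sum (a_k - 1/n)^2 + 1/n$ used above is elementary; a short check with concrete weights may still be helpful. The weights below are arbitrary values of ours that sum to 1.

```python
def sum_of_squares(weights):
    """sum a_k^2, the factor multiplying sigma^2 in the risk of the weighted average."""
    return sum(a * a for a in weights)

def decomposition(weights):
    """sum (a_k - 1/n)^2 + 1/n; equals sum a_k^2 when the weights sum to 1."""
    n = len(weights)
    return sum((a - 1.0 / n) ** 2 for a in weights) + 1.0 / n

weights = [0.5, 0.3, 0.2]             # arbitrary weights summing to 1
uniform = [1.0 / 3.0] * 3             # the minimizing choice a_k = 1/n
print(sum_of_squares(weights), decomposition(weights))   # both sides of the identity
print(sum_of_squares(uniform))                           # the minimal value 1/n
```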


Example I.6.2. As an estimator of the parameter $p$ of a binomial population we may choose the “successes frequency” $p^* := S_n/n$ (cf. Example I.2.7). The r.v. $p^*$ takes on the values $k/n$ ($k = 0,\dots,n$), and

\[ P[p^* = k/n] = P[S_n = k] = \binom{n}{k} p^k q^{n-k}. \]

The estimator $p^*$ is unbiased, since

\[ Ep^* = ES_n/n = np/n = p. \]

Its risk function is

\[ \sigma^2(p^*) = npq/n^2 = pq/n \le 1/4n. \]

By Corollary I.4.13, $(p^* - p)/\sqrt{pq/n}$ is approximately $N(0,1)$-distributed for $n$ “large”. Thus, for example,

\[ P[|p^* - p| < 2\sqrt{pq/n}] > 0.95, \]

and since $pq \le 1/4$, we surely have

\[ P[|p^* - p| < 1/\sqrt{n}] > 0.95. \]

Thus, the estimated parameter $p$ lies in the interval $(p^* - 1/\sqrt{n},\ p^* + 1/\sqrt{n})$ with “confidence” 0.95 (at least). We return to this idea later.

In comparing two binomial populations (e.g., in quality control problems), we may wish to estimate the difference $p_1 - p_2$ of their parameters, using samples of sizes $n$ and $m$ from the respective populations. A reasonable estimator is the difference $V = S_n/n - S_m'/m$ of the success frequencies in the two samples. The estimator $V$ is clearly unbiased, and its risk function is (by Bienaymé’s identity)

\[ \sigma^2(V) = \sigma^2(S_n/n) + \sigma^2(S_m'/m) = p_1 q_1/n + p_2 q_2/m \le 1/4n + 1/4m. \]

For large samples, we may use the normal approximation (cf. Corollary I.4.13) to test the “zero hypothesis” that $p_1 - p_2 = 0$ (i.e., that the two binomial populations are equidistributed) by using the statistic $V$.


Example I.6.3. Consider the two-layered population of Example I.2.8. An estimator of the proportion $N_1/N$ of the layer $P_1$ in the population could be the sample frequency $U := D_s/s$ of $P_1$-objects in a random sample of size $s$. We have $EU = (1/s)(sN_1/N) = N_1/N$, so that $U$ is unbiased. The risk function is (cf. Example I.2.8)

\[ \sigma^2(U) = (1/s)\, \frac{N - s}{N - 1}\, \frac{N_1}{N} \left( 1 - \frac{N_1}{N} \right) \le 1/4s. \]


Example I.6.4. The sample average $\bar{X}$ is a natural unbiased estimator $\lambda^*$ of the parameter $\lambda$ of a Poissonian population. Its risk function is $\sigma^2(\lambda^*) = \lambda/n$ (cf. Example I.3.9).


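Example I.6.5. Let $X_1,\dots,X_n$ be independent $N(\mu,\sigma^2)$-r.v.’s. For any weights $a_1,\dots,a_n$ (with $\sum a_k = 1$), the statistic

\[ V := \frac{n}{n-1} \left[ \sum_{k=1}^n a_k X_k^2 - \bar{X}^2 \right] \]

is an unbiased estimator of $\sigma^2$. Indeed,

\[ E(\bar{X})^2 = \sigma^2(\bar{X}) + [E\bar{X}]^2 = \sigma^2/n + \mu^2, \]

and therefore,

\[ EV = \frac{n}{n-1} \left[ \sum_k a_k(\sigma^2 + \mu^2) - (\sigma^2/n + \mu^2) \right] = \frac{n}{n-1}\,(1 - 1/n)\,\sigma^2 = \sigma^2. \]

When $a_k = 1/n$ for all $k$, the estimator $V$ is the “sample error”

\[ V = nS^2/(n-1) = \frac{1}{n-1} \sum_{k=1}^n (X_k - \bar{X})^2. \]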


Example I.6.6. Let $X_1, X_2, \dots$ be independent r.v.’s with the same distribution. Assume that the moment $\mu_r$ of that distribution exists for some $r \ge 1$. For each $n$, the arithmetical means

\[ m_{r,n} := (1/n) \sum_{k=1}^n X_k^r \]

are unbiased estimators of the $r$th moment $\mu_r$ of the distribution. By Example I.2.14 applied to $Y_k = X_k^r$, $m_{r,n} \to \mu_r$ in probability (as $n \to \infty$). We say that the sequence of estimators $\{m_{r,n}\}_n$ of the parameter $\mu_r$ is consistent (the general definition of consistency is the same, mutatis mutandis).

If $\{\theta_n^*\}$ is a consistent sequence of estimators for $\theta$ and $\{a_n\}$ is a real sequence converging to 1, then $\{a_n \theta_n^*\}$ is clearly consistent as well. Biased estimators could be consistent (start with any consistent sequence of unbiased estimators $\theta_n^*$; then $((n-1)/n)\theta_n^*$ are still consistent estimators, but their bias is $\theta/n \ne 0$ (unless $\theta = 0$)).


I.6.7. Maximum likelihood estimators (MLEs).

The distribution density $f(x_1,\dots,x_n;\theta)$ of $(X_1,\dots,X_n)$, considered as a function $L(\theta)$, is called the likelihood function (for $X_1,\dots,X_n$). The MLE (for $X_1,\dots,X_n$) is $\theta^*(X_1,\dots,X_n)$, where $\theta^* = \theta^*(x_1,\dots,x_n)$ is the value of $\theta$ for which $L(\theta)$ is maximal (if such a value exists), that is,

\[ f(x_1,\dots,x_n;\theta^*) \ge f(x_1,\dots,x_n;\theta) \]

for all $(x_1,\dots,x_n) \in \mathbb{R}^n$ and $\theta$ in the relevant range. Hence,

\[ P_{\theta^*}[X \in B] \ge P_\theta[X \in B] \]

for all $B \in \mathcal{B}(\mathbb{R}^n)$ and $\theta$, where the subscript of $P$ means that the distribution of $X$ is taken with the parameter indicated.


We consider the case when X1,..., X» are independent N(j,07)-r.v.s. Thus 
L(6) = (2702) —7/2@ Zee w)?/20° 6 = (1,07). 
Maximizing L is equivalent to maximizing the function 
(0) = log L(6) = —(n/2) log(2n0”) — So (ax — 1)?/20?. 


Case 1. MLE for yu when o is given. The necessary condition for u = p*, 


dg 
ou = S- (ex - p)/0? =0, (1) 
implies that w* = p*(a1,...,%n) = &. This is indeed a maximum point, since 
o2 


Thus the MLE for yw (when a is given) is 
[i (Xet ky) SX. 
Case 2. MLE for o? when ju is given. The solution of the equation 


5 = —n/20? + (1/204) S “(ap — 1)? =0 (2) 


(07)* = (1/n) So (ae — H)?. 


Since the second derivative of ¢ at (a7)* is equal to —n/2[(o7)*|? < 0, we 
obtained indeed a maximum point of ¢, and the corresponding MLE for o? is 


(07)*(X1,..-,Xn) = (1/n) 5 (Xp — w)?. 


Case 3. MLE for 0 := (1,07) (as an unknown vector parameter). We need to 
solve the equations (1) and (2) simultaneously. From (1), we get y* = Z; from 
(2) we get 


(07)* = (1/n) So (ax = uy? _ (1/n) SG = z)? = 8%, 


The solution (Z, s”) is indeed a maximum point for ¢, since the Hessian for ¢ at 
this point equals n?/2s° > 0, and the second partial derivative of ¢ with respect 
to p (at this point) equals —n/s? < 0. Thus the MLE for @ is 


OF Xiyes ose) =; G52): 




Note that the estimator $S^2$ is biased, since
\[ E(S^2) = \frac{n-1}{n}\,\sigma^2 \neq \sigma^2, \]
but consistent: indeed,
\[ S^2 = (1/n)\sum_{k=1}^n X_k^2 - \Big[(1/n)\sum_{k=1}^n X_k\Big]^2; \]
by the weak law of large numbers, the first average on the right-hand side
converges in probability to the second moment $\mu_2$, while the second average
converges to the first moment $\mu_1 = \mu$; hence $S^2$ converges in probability to
$\mu_2 - \mu^2 = \sigma^2$ (when $n \to \infty$).


I.6.1 Confidence intervals


I.6.8. Together with the estimator $\theta^*$ of a real parameter $\theta$, it is useful to have
an interval around $\theta^*$ that contains $\theta$ with some high probability $1-\alpha$ (called the
confidence of the interval). In fact, $\theta$ is not a random variable, and the rigorous
approach is to find an interval $(a(\theta), b(\theta))$ such that
\[ P[\theta^* \in (a(\theta), b(\theta))] = 1-\alpha. \tag{3} \]
The corresponding interval for $\theta$ is a $(1-\alpha)$-confidence interval for $\theta$.
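Example 1. Consider an $N(\mu,\sigma^2)$-population with known variance. Let
\[ Z := \frac{\bar{X}-\mu}{\sigma/\sqrt{n}}. \]
Then $Z$ is $N(0,1)$-distributed. In our case, (3) for the estimator $\bar{X}$ of $\mu$ takes
the equivalent form
\[ P\Big[\frac{a(\mu)-\mu}{\sigma/\sqrt{n}} < Z < \frac{b(\mu)-\mu}{\sigma/\sqrt{n}}\Big] = 1-\alpha. \tag{4} \]
For simplicity, take a symmetric $Z$-interval,
\[ \frac{b(\mu)-\mu}{\sigma/\sqrt{n}} = c = -\frac{a(\mu)-\mu}{\sigma/\sqrt{n}}, \]
that is,
\[ a(\mu) = \mu - c\sigma/\sqrt{n}; \qquad b(\mu) = \mu + c\sigma/\sqrt{n}. \]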
Let $\Phi$ denote the $N(0,1)$-distribution function. By symmetry of the normal
density,
\[ \Phi(-c) = 1 - \Phi(c), \]
and therefore (4) takes the form
\[ 1-\alpha = \Phi(c) - \Phi(-c) = 2\Phi(c) - 1, \]
that is,
\[ \Phi(c) = 1 - \alpha/2, \]
and we get the unique solution for $c$:
\[ c = \Phi^{-1}(1-\alpha/2) := z_{1-\alpha/2}. \tag{5} \]
By symmetry of the normal density, $-c = z_{\alpha/2}$. The interval for the estimator
$\bar{X}$ is then
\[ \mu + z_{\alpha/2}\,\sigma/\sqrt{n} := a(\mu) < \bar{X} < b(\mu) := \mu + z_{1-\alpha/2}\,\sigma/\sqrt{n}, \]
and the corresponding $(1-\alpha)$-confidence interval for $\mu$ is
\[ \bar{X} - z_{1-\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n}. \]
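As an illustration of Example 1, the interval can be computed with the standard normal quantile function from Python's `statistics` module. This is a sketch only; the function name and the numerical inputs are hypothetical. Since $-z_{\alpha/2} = z_{1-\alpha/2}$, the interval is symmetric about $\bar{X}$.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, alpha):
    """(1 - alpha)-confidence interval for mu when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{1 - alpha/2} = Phi^{-1}(1 - alpha/2)
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Illustrative numbers: sample mean 10, sigma = 2, n = 25, confidence 0.95.
lo, hi = z_confidence_interval(x_bar=10.0, sigma=2.0, n=25, alpha=0.05)
```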


Example 2. Consider an $N(\mu,\sigma^2)$-population with both parameters unknown.
We still use the MLE $\mu^* = \bar{X}$. By Corollary I.5.6, the statistic
\[ T := \frac{\bar{X}-\mu}{S/\sqrt{n-1}} \]
has the Student distribution with $n-1$ degrees of freedom. By symmetry of the
Student density, the argument in Example 1 applies in this case by replacing $\Phi$
with $F_{T,n-1}$, the Student distribution function for $n-1$ degrees of freedom, and
$\sigma/\sqrt{n}$ by $S/\sqrt{n-1}$. Let
\[ t_{\gamma,n-1} := F_{T,n-1}^{-1}(\gamma). \]
Then a $(1-\alpha)$-confidence interval for $\mu$ is
\[ \bar{X} - t_{1-\alpha/2,n-1}\,S/\sqrt{n-1} < \mu < \bar{X} - t_{\alpha/2,n-1}\,S/\sqrt{n-1}. \]


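Example 3. In the context of Example 2, we look for a $(1-\alpha)$-confidence
interval for $\sigma^2$. By Fisher's theorem, the statistic $V := nS^2/\sigma^2$ has the $\chi^2$
distribution with $n-1$ degrees of freedom (denoted for simplicity by $F_{n-1}$).
Denote
\[ \chi^2_{\gamma,n-1} := F_{n-1}^{-1}(\gamma) \qquad (0 < \gamma < 1). \]
Choosing
\[ a = \chi^2_{\alpha/2,n-1}; \qquad b = \chi^2_{1-\alpha/2,n-1}, \]
we get
\[ P[a < V < b] = F_{n-1}(b) - F_{n-1}(a) = (1-\alpha/2) - \alpha/2 = 1-\alpha, \]
which is equivalent to
\[ P\Big[\frac{nS^2}{\chi^2_{1-\alpha/2,n-1}} < \sigma^2 < \frac{nS^2}{\chi^2_{\alpha/2,n-1}}\Big] = 1-\alpha, \]
from which we read off the wanted $(1-\alpha)$-confidence interval for $\sigma^2$.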




I.6.2 Testing of hypothesis and decision


I.6.9. Let $X_1,\dots,X_n$ be independent r.v.'s with common distribution $F(\cdot\,;\theta)$,
with an unknown parameter $\theta$. A simple hypothesis is a hypothesis of the form
\[ H_0 : \theta = \theta_0. \]
This is the so-called "zero hypothesis".
We may consider an "alternative hypothesis" that is also simple, that is,
\[ H_1 : \theta = \theta_1. \]
Let $P_{\theta_i}$ ($i = 0,1$) denote the probability of any event "involving" $X_1,\dots,X_n$,
under the assumption that their common distribution is $F(\cdot\,;\theta_i)$.
The set $C \in \mathcal{B}(\mathbb{R}^n)$ is called the rejection region of a statistical test if the
zero hypothesis is rejected when $X \in C$, where $X = (X_1,\dots,X_n)$.
The significance of the test is the probability $\alpha$ of rejecting $H_0$ when $H_0$ is
true.
For the simple hypothesis $H_0$,
\[ \alpha = \alpha(C) := P_{\theta_0}[X \in C]. \]
Similarly, the probability
\[ \beta = \beta(C) := P_{\theta_1}[X \in C] \]
is called the power of the test. It is the probability of rejecting $H_0$ when the
alternative hypothesis $H_1$ is true.
It is clearly desirable to choose $C$ such that $\alpha$ is minimal and $\beta$ is maximal.
The following result goes in this direction.
Lemma I.6.10 (Neyman-Pearson). Suppose the population distribution has
the density $h(\cdot\,;\theta)$. For $k \in \mathbb{R}$, let
\[ C_k := \Big\{(x_1,\dots,x_n) \in \mathbb{R}^n;\ \prod_{j=1}^n h(x_j;\theta_1) > k \prod_{j=1}^n h(x_j;\theta_0)\Big\}. \]
Then among all $C \in \mathcal{B}(\mathbb{R}^n)$ with $\alpha(C) \le \alpha(C_k)$, the set $C_k$ has maximal power.
In symbols, $\beta(C) \le \beta(C_k)$ for all $C$ with $\alpha(C) \le \alpha(C_k)$.


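Proof. Let $C \in \mathcal{B}(\mathbb{R}^n)$ be such that $\alpha(C) \le \alpha(C_k)$. Denote $D = C \cap C_k$. Since
\[ C - D = C \cap C_k^c \subset C_k^c, \]
we have
\begin{align*}
\beta(C - D) &= \beta(C \cap C_k^c) := P_{\theta_1}[X \in C \cap C_k^c] \\
&= \int_{C \cap C_k^c} \prod_j h(x_j;\theta_1)\,dx_1\cdots dx_n \le k \int_{C \cap C_k^c} \prod_j h(x_j;\theta_0)\,dx_1\cdots dx_n \\
&= k P_{\theta_0}[X \in C - D] = k P_{\theta_0}[X \in C] - k P_{\theta_0}[X \in D] \\
&\le k P_{\theta_0}[X \in C_k] - k P_{\theta_0}[X \in D] = k P_{\theta_0}[X \in C_k - D] \\
&= k \int_{C_k - D} \prod_j h(x_j;\theta_0)\,dx_1\cdots dx_n \\
&\le \int_{C_k - D} \prod_j h(x_j;\theta_1)\,dx_1\cdots dx_n = P_{\theta_1}[X \in C_k - D] = \beta(C_k - D).
\end{align*}
Hence
\[ \beta(C) = \beta(C - D) + \beta(D) \le \beta(C_k - D) + \beta(D) = \beta(C_k). \]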


Note that the proof does not depend on the special form of the joint density
of $(X_1,\dots,X_n)$. Thus, if $C_k$ is defined using the joint density (with the values $\theta_i$
of the parameter), the Neyman-Pearson lemma is valid without the independence
assumption on $X_1,\dots,X_n$.


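Application I.6.11. Suppose $F$ is the $N(\mu,\sigma^2)$ distribution with $\sigma^2$ known,
and consider the simple hypotheses
\[ H_i : \mu = \mu_i, \qquad i = 0,1. \]
For $k > 0$, we have (by taking logarithms):
\begin{align*}
C_k &= \Big\{(x_1,\dots,x_n) \in \mathbb{R}^n;\ (-1/2\sigma^2)\sum_j [(x_j-\mu_1)^2 - (x_j-\mu_0)^2] > \log k\Big\} \\
&= \Big\{(x_1,\dots,x_n);\ (\mu_1-\mu_0)\Big[\sum_j x_j - n(\mu_1+\mu_0)/2\Big] > \sigma^2\log k\Big\}.
\end{align*}
Denote
\[ k^* := \frac{\mu_1+\mu_0}{2} + \frac{\sigma^2\log k}{n(\mu_1-\mu_0)}. \]
Then if $\mu_1 > \mu_0$,
\[ C_k = \{(x_1,\dots,x_n);\ \bar{x} > k^*\}, \]
and if $\mu_1 < \mu_0$,
\[ C_k = \{(x_1,\dots,x_n);\ \bar{x} < k^*\}. \]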


We choose the rejection region $C = C_k$ for maximal power (by the Neyman-Pearson
lemma). Note that it is determined by the statistic $\bar{X}$: $H_0$ is rejected
if $\bar{X} > k^*$ (in case $\mu_1 > \mu_0$). The critical value $k^*$ is found by means of the
requirement that the significance be equal to some given $\alpha$:
\[ \alpha(C_k) = \alpha, \]
that is, when $\mu_1 > \mu_0$,
\[ P_{\mu_0}[\bar{X} > k^*] = \alpha. \]
Since $\bar{X}$ is $N(\mu_0,\sigma^2/n)$-distributed under the hypothesis $H_0$, the statistic
\[ Z := \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \]
is $N(0,1)$-distributed. Using the notation of Example 1 in Section I.6.8, we get
\[ \alpha = P_{\mu_0}\Big[Z > \frac{k^*-\mu_0}{\sigma/\sqrt{n}}\Big] = 1 - \Phi\Big(\frac{k^*-\mu_0}{\sigma/\sqrt{n}}\Big), \]
so that
\[ \frac{k^*-\mu_0}{\sigma/\sqrt{n}} = z_{1-\alpha}, \]
and
\[ k^* = \mu_0 + z_{1-\alpha}\,\sigma/\sqrt{n}. \]
We thus arrive at the "optimal" rejection region
\[ C_k = \{(x_1,\dots,x_n) \in \mathbb{R}^n;\ \bar{x} > \mu_0 + z_{1-\alpha}\,\sigma/\sqrt{n}\} \]
for significance level $\alpha$ (in case $\mu_1 > \mu_0$).
An analogous calculation for the case $\mu_1 < \mu_0$ gives the "optimal" rejection
region at significance level $\alpha$
\[ C_k = \{(x_1,\dots,x_n);\ \bar{x} < \mu_0 + z_{\alpha}\,\sigma/\sqrt{n}\}. \]
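A small numerical sketch of Application I.6.11 (case $\mu_1 > \mu_0$) may help fix ideas. The helper name and the numbers are hypothetical; the power is computed from the fact that $\bar{X}$ is $N(\mu_1,\sigma^2/n)$-distributed under $H_1$.

```python
import math
from statistics import NormalDist

def np_test_for_mean(mu0, mu1, sigma, n, alpha):
    """Critical value k* and power of the Neyman-Pearson test, case mu1 > mu0."""
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)
    k_star = mu0 + std_normal.inv_cdf(1 - alpha) * se       # mu0 + z_{1-alpha} sigma/sqrt(n)
    significance = 1 - std_normal.cdf((k_star - mu0) / se)  # P_{mu0}[X_bar > k*]
    power = 1 - std_normal.cdf((k_star - mu1) / se)         # P_{mu1}[X_bar > k*]
    return k_star, significance, power

# Illustrative numbers only.
k_star, significance, power = np_test_for_mean(mu0=0.0, mu1=1.0, sigma=2.0, n=16, alpha=0.05)
```

In this sketch, increasing $n$ (with $\alpha$ fixed) moves $k^*$ toward $\mu_0$ and increases the power.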


Application I.6.12. Suppose again that $F$ is the $N(\mu,\sigma^2)$ distribution, this
time with $\mu$ known. Consider the simple hypotheses
\[ H_i : \sigma = \sigma_i, \qquad i = 0,1. \]
We deal with the case $\sigma_1 > \sigma_0$ (the other case is analogous).
For $k > 0$ given, the Neyman-Pearson rejection region is (after taking
logarithms)
\[ C_k = \Big\{x = (x_1,\dots,x_n) \in \mathbb{R}^n;\ \sum_{j=1}^n (x_j-\mu)^2 > k^*\Big\}, \]
where
\[ k^* := \frac{2\log k + 2n\log(\sigma_1/\sigma_0)}{\sigma_0^{-2} - \sigma_1^{-2}}. \]
We require the significance level $\alpha$, that is,
\[ \alpha = \alpha(C_k) = P_{\sigma_0}\Big[\sum_{j=1}^n (X_j-\mu)^2 > k^*\Big]. \]
Since $(X_j-\mu)/\sigma_0$ are independent $N(0,1)$-distributed r.v.'s (under the
hypothesis $H_0$), the statistic
\[ \chi^2 := (1/\sigma_0^2)\sum_{j=1}^n (X_j-\mu)^2 \]
has the standard $\chi^2$ distribution with $n$ degrees of freedom (cf. Section I.3.15,
Proposition 2). Thus
\[ \alpha = P_{\sigma_0}[\chi^2 > k^*/\sigma_0^2] = 1 - F_{\chi^2}(k^*/\sigma_0^2). \]
Denote
\[ c_\gamma := F_{\chi^2}^{-1}(\gamma) \qquad (0 < \gamma < 1) \]
(for $n$ degrees of freedom). Then $k^* = \sigma_0^2 c_{1-\alpha}$, and the Neyman-Pearson
rejection region for $H_0$ at significance level $\alpha$ is
\[ C_k = \Big\{x \in \mathbb{R}^n;\ \sum_{j=1}^n (x_j-\mu)^2 > \sigma_0^2 c_{1-\alpha}\Big\}. \]


I.6.3 Tests based on a statistic
I.6.13. Suppose we wish to test the hypothesis
\[ H_0 : \theta = \theta_0 \]
against the alternative hypothesis
\[ H_1 : \theta \neq \theta_0 \]
about the parameter $\theta$ of the population distribution $F = F(\cdot\,;\theta)$.

Let $X_1,\dots,X_n$ be independent $F$-distributed r.v.'s (i.e., a random sample
from the population), and suppose that the distribution $F_{g(X)}$ of some statistic
$g(X)$ (where $g : \mathbb{R}^n \to \mathbb{R}$ is a Borel function) is known explicitly. Denote this
distribution, under the hypothesis $H_0$, by $F_0$. It is reasonable to reject $H_0$ when
$g(X) > c$ ("one-sided test") or when either $g(X) < a$ or $g(X) > b$ ("two-sided
test"), where $c$ and $a < b$ are some "critical values". The corresponding rejection
regions are
\[ C = \{x \in \mathbb{R}^n;\ g(x) > c\} \tag{6} \]
and
\[ C = \{x \in \mathbb{R}^n;\ g(x) < a \text{ or } g(x) > b\}. \tag{7} \]
For the one-sided test, the significance $\alpha$ requirement is (assuming that $F_0$ is
continuous):
\[ \alpha = \alpha(C) := P_{\theta_0}[g(X) > c] = 1 - F_0(c), \]
and the corresponding critical value of $c$ is
\[ c_\alpha = F_0^{-1}(1-\alpha). \]
In case (7) (which is more adequate for the "decision problem" with the
alternative hypothesis $H_1$), it is convenient to choose the values $a, b$ by requiring
\[ P_{\theta_0}[g(X) < a] = P_{\theta_0}[g(X) > b] = \alpha/2, \]
which is sufficient for having $\alpha(C) = \alpha$. The critical values of $a, b$ are then
\[ a_\alpha = F_0^{-1}(\alpha/2); \qquad b_\alpha = F_0^{-1}(1-\alpha/2). \tag{8} \]


Example I.6.14. The z-test
Suppose $F$ is the normal distribution with known variance. We wish to test
the hypothesis
\[ H_0 : \mu = \mu_0 \]
against
\[ H_1 : \mu \neq \mu_0. \]
Using the $N(0,1)$-distributed statistic $Z$ as in Application I.6.11, the two-sided
critical values at significance level $\alpha$ are
\[ a_\alpha = z_{\alpha/2}; \qquad b_\alpha = z_{1-\alpha/2}. \]
By symmetry of the normal density, $a_\alpha = -b_\alpha$. The zero hypothesis is rejected
(at significance level $\alpha$) if either $Z < z_{\alpha/2}$ or $Z > z_{1-\alpha/2}$, that is, if $|Z| > z_{1-\alpha/2}$.
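The decision rule $|Z| > z_{1-\alpha/2}$ of the two-sided z-test can be sketched as follows. The function name and inputs are hypothetical; only the standard normal quantile from the `statistics` module is used.

```python
import math
from statistics import NormalDist

def z_test_two_sided(sample_mean, mu0, sigma, n, alpha):
    """Return (Z, critical value z_{1-alpha/2}, reject H0?)."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, critical, abs(z) > critical

# Illustrative numbers: H0: mu = 10, known sigma = 2, n = 25, alpha = 0.05.
z_far, critical, reject_far = z_test_two_sided(10.9, 10.0, 2.0, 25, 0.05)
z_near, _, reject_near = z_test_two_sided(10.5, 10.0, 2.0, 25, 0.05)
```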


Example I.6.15. The t-test
Suppose both parameters of the normal distribution $F$ are unknown, and
consider the hypotheses $H_i$ of Example I.6.14. By Corollary I.5.6, the statistic
\[ T := \frac{\bar{X}-\mu_0}{S/\sqrt{n-1}} \]
has (under $H_0$) the Student distribution with $n-1$ degrees of freedom. With notation as in
Section I.6.8, Example 2, the critical values (at significance level $\alpha$, for the test
based on the statistic $T$) are
\[ a_\alpha = t_{\alpha/2,n-1}; \qquad b_\alpha = t_{1-\alpha/2,n-1}. \]
By symmetry of the Student density, the zero hypothesis is rejected (at
significance level $\alpha$) if $|T| > t_{1-\alpha/2,n-1}$.


Example I.6.16. Comparing the means of two normal populations.

The zero hypothesis is that the two populations have the same normal
distribution.

Two random samples $X_1,\dots,X_n$ and $Y_1,\dots,Y_m$ are taken from the
respective populations. Under the zero hypothesis $H_0$, the statistic $W$ of
Corollary I.5.7 has the Student distribution with $\nu = n + m - 2$ degrees of
freedom. By symmetry of this distribution, the two-sided test at significance
level $\alpha$ rejects $H_0$ if $|W| > t_{1-\alpha/2,\nu}$.


Example I.6.17. Comparing the variances of two normal populations.

With $H_0$, $X$, and $Y$ as in Example I.6.16, the statistic $F$ of Corollary I.5.11
has the Snedecor distribution with $(n-1, m-1)$ degrees of freedom. If $F_0$
is this distribution, the critical values at significance level $\alpha$ are given by (8),
Section I.6.13.


I.7 Conditional probability 


I.7.1 Heuristics


Let $(\Omega,\mathcal{A},P)$ be a probability space, and $A_i, B_j \in \mathcal{A}$. Consider a two-stage
experiment, with possible outcomes $A_1,\dots,A_m$ in Stage 1 and $B_1,\dots,B_n$ in
Stage 2.

On the basis of the "counting principle", it is intuitively acceptable that
\[ P(A_i \cap B_j) = P(A_i)\,P(B_j|A_i), \tag{1} \]
where $P(B_j|A_i)$ denotes the so-called conditional probability that $B_j$ will occur
in Stage 2, when it is given that $A_i$ occurred in Stage 1. We take (1) as the
definition of $P(B_j|A_i)$ (whenever $P(A_i) \neq 0$).


Definition I.7.1. If $A \in \mathcal{A}$ has $PA \neq 0$, the conditional probability of $B \in \mathcal{A}$
given $A$ is
\[ P(B|A) := \frac{P(A \cap B)}{PA}. \]
It is clear that $P(\cdot|A)$ is a probability measure on $\mathcal{A}$. For any $L^1(P)$ real r.v. $X$,
the expectation of $X$ relative to $P(\cdot|A)$ makes sense. It is called the conditional
expectation of $X$ given $A$, and is denoted by $E(X|A)$:
\[ E(X|A) := \int_\Omega X\,dP(\cdot|A) = (1/PA)\int_A X\,dP. \tag{2} \]
Equivalently,
\[ E(X|A)\,PA = \int_A X\,dP \qquad (A \in \mathcal{A},\ PA \neq 0). \tag{3} \]
Since $P(B|A) = E(I_B|A)$, we may take the conditional expectation as the basic
concept, and view the conditional probability as a derived concept.

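Let $\{A_i\} \subset \mathcal{A}$ be a partition of $\Omega$ (with $PA_i \neq 0$), and let $\mathcal{A}_0$ be the $\sigma$-algebra
generated by $\{A_i\}$. Denote
\[ E(X|\mathcal{A}_0) := \sum_i E(X|A_i)\,I_{A_i}. \tag{4} \]
This is an $\mathcal{A}_0$-measurable function, which takes the constant value $E(X|A_i)$ on
$A_i$ ($i = 1,2,\dots$). Any $A \in \mathcal{A}_0$ has the form $A = \bigcup_{i \in J} A_i$, where $J \subset \mathbb{N}$. By (3),
\begin{align*}
\int_A E(X|\mathcal{A}_0)\,dP &= \sum_i E(X|A_i)\,P(A_i \cap A) = \sum_{i \in J} E(X|A_i)\,PA_i \\
&= \sum_{i \in J} \int_{A_i} X\,dP = \int_A X\,dP \qquad (A \in \mathcal{A}_0). \tag{5}
\end{align*}
Relation (5) may be used to define the conditional expectation of $X$, given the
(arbitrary) $\sigma$-subalgebra $\mathcal{A}_0$ of $\mathcal{A}$.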
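Relation (5) is easy to verify on a finite probability space. The sketch below assumes a hypothetical six-point space with uniform $P$, the r.v. $X(\omega) = \omega^2$, and a three-cell partition; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical finite model: Omega = {0,...,5}, uniform P, X(w) = w^2.
prob = {w: Fr(1, 6) for w in range(6)}
X = {w: Fr(w * w) for w in range(6)}
partition = [{0, 1}, {2, 3, 4}, {5}]          # the atoms A_i generating A_0

def integral(f, A):
    """Integral of f over the event A with respect to P."""
    return sum(f[w] * prob[w] for w in A)

def cond_exp_on_partition(f, cells):
    """The function in (4): constant value E(f|A_i) on each cell A_i."""
    out = {}
    for cell in cells:
        value = integral(f, cell) / sum(prob[w] for w in cell)   # formula (2)
        for w in cell:
            out[w] = value
    return out

E_X_given_A0 = cond_exp_on_partition(X, partition)

# Every A in A_0 is a union of cells; check (5) on all of them.
identity_5_holds = all(
    integral(E_X_given_A0, set().union(*cells)) == integral(X, set().union(*cells))
    for r in range(len(partition) + 1)
    for cells in combinations(partition, r)
)
```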
Definition I.7.2. Let $\mathcal{A}_0$ be a $\sigma$-subalgebra of $\mathcal{A}$, and let $X$ be an $L^1(P)$
real r.v. The conditional expectation of $X$ given $\mathcal{A}_0$ is the ($P$-a.s. determined)
$\mathcal{A}_0$-measurable function $E(X|\mathcal{A}_0)$ satisfying the identity
\[ \int_A E(X|\mathcal{A}_0)\,dP = \int_A X\,dP \qquad (A \in \mathcal{A}_0). \tag{6} \]
Note that the right-hand side of (6) defines a real-valued measure $\nu$ on $\mathcal{A}_0$,
absolutely continuous with respect to $P$ (restricted to $\mathcal{A}_0$). By the Radon-Nikodym
theorem, there exists a $P$-a.s. determined $\mathcal{A}_0$-measurable function,
integrable on $(\Omega,\mathcal{A}_0,P)$, such that (6) is valid. Actually, $E(X|\mathcal{A}_0)$ is the
Radon-Nikodym derivative of $\nu$ with respect to (the restriction of) $P$.

The conditional probability of $B \in \mathcal{A}$ given $\mathcal{A}_0$ is then defined by
\[ P(B|\mathcal{A}_0) := E(I_B|\mathcal{A}_0). \]
By (6), it is the $P$-a.s. determined $\mathcal{A}_0$-measurable function satisfying
\[ \int_A P(B|\mathcal{A}_0)\,dP = P(A \cap B) \qquad (A \in \mathcal{A}_0). \tag{7} \]
We show that $E(X|\mathcal{A}_0)$ defined by (6) coincides with the function defined before
for the special case of a $\sigma$-subalgebra generated by a sequence of mutually disjoint
atoms. The idea is included in the following.


Theorem I.7.3. The conditional expectation $E(X|\mathcal{A}_0)$ has a.s. the constant
value $E(X|A)$ on each $P$-atom $A \in \mathcal{A}_0$. ($A \in \mathcal{A}_0$ is a $P$-atom if $PA > 0$, and $A$
is not the disjoint union of two $\mathcal{A}_0$-measurable sets with positive $P$-measure.)


Proof. Suppose $f : \Omega \to [-\infty,\infty]$ is $\mathcal{A}_0$-measurable, and let $A \in \mathcal{A}_0$ be a
$P$-atom. We show that $f$ is a.s. constant on $A$.
Denote $A_x := \{\omega \in A;\ f(\omega) < x\}$, for $x > -\infty$.
By monotonicity of $P$, if $-\infty < y < x \le \infty$ and $PA_x = 0$, then also $PA_y = 0$.
Let
\[ h = \sup\{x;\ PA_x = 0\}. \]
Then $PA_x = 0$ for all $x < h$. Since
\[ A_h = \bigcup_{r < h;\ r \in \mathbb{Q}} A_r, \]
we have
\[ PA_h = 0. \tag{8} \]
By definition of $h$, we have $PA_x > 0$ for $x > h$, and since $A$ is a $P$-atom
and $A_x \in \mathcal{A}_0$ (because $f$ is $\mathcal{A}_0$-measurable) is a subset of $A$, it follows that
$P\{\omega \in A;\ f(\omega) \ge x\} = 0$ for all $x > h$. Writing
\[ \{\omega \in A;\ f(\omega) > h\} = \bigcup_{r > h;\ r \in \mathbb{Q}} \{\omega \in A;\ f(\omega) \ge r\}, \]
we see that $P\{\omega \in A;\ f(\omega) > h\} = 0$. Together with (8), this proves that
\[ P\{\omega \in A;\ f(\omega) \neq h\} = 0, \]
that is, $f(\omega) = h$ $P$-a.s. on $A$.
Applying the conclusion to $f = E(X|\mathcal{A}_0)$, we see from (6) that this constant
value is necessarily $E(X|A)$ (cf. (2)).


We collect some elementary properties of the conditional expectation in the 
following. 


Theorem I.7.4.

(1) $E(E(X|\mathcal{A}_0)) = EX$.
(2) If $X$ is $\mathcal{A}_0$-measurable, then $E(X|\mathcal{A}_0) = X$ a.s. (this is true in particular
for $X$ constant, and for any r.v. $X$ if $\mathcal{A}_0 = \mathcal{A}$).
(3) Monotonicity: for real r.v.'s $X, Y \in L^1(P)$ such that $X \le Y$ a.s.,
$E(X|\mathcal{A}_0) \le E(Y|\mathcal{A}_0)$ a.s. (in particular, since $-|X| \le X \le |X|$,
$|E(X|\mathcal{A}_0)| \le E(|X|\,|\mathcal{A}_0)$ a.s.).
(4) Linearity: for $X, Y \in L^1(P)$ and $\alpha, \beta \in \mathbb{C}$,
\[ E(\alpha X + \beta Y|\mathcal{A}_0) = \alpha E(X|\mathcal{A}_0) + \beta E(Y|\mathcal{A}_0) \quad \text{a.s.} \]
Proof.
(1) Take $A = \Omega$ in (6).
(2) $X$ is $\mathcal{A}_0$-measurable (hypothesis!) and satisfies trivially (6).
(3) The right-hand side of (6) is monotonic; therefore
\[ \int_A E(X|\mathcal{A}_0)\,dP \le \int_A E(Y|\mathcal{A}_0)\,dP \]
for all $A \in \mathcal{A}_0$, and the conclusion follows for example from the "averages
lemma".
(4) The right-hand side of the equation in property (4) is $\mathcal{A}_0$-measurable and
its integral over $A$ equals $\int_A (\alpha X + \beta Y)\,dP$ for all $A \in \mathcal{A}_0$. By (6), it
coincides a.s. with $E(\alpha X + \beta Y|\mathcal{A}_0)$.




We show next that conditional expectations behave like “projections” in an 
appropriate sense. 

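Theorem I.7.5. Let $\mathcal{A}_0 \subset \mathcal{A}_1$ be $\sigma$-subalgebras of $\mathcal{A}$. Then for all $X \in L^1(P)$,
\[ E(E(X|\mathcal{A}_0)|\mathcal{A}_1) = E(E(X|\mathcal{A}_1)|\mathcal{A}_0) = E(X|\mathcal{A}_0) \quad \text{a.s.} \tag{9} \]
Proof. $E(X|\mathcal{A}_0)$ is $\mathcal{A}_0$-measurable, hence it is also $\mathcal{A}_1$-measurable (since $\mathcal{A}_0 \subset
\mathcal{A}_1$), and therefore, by Theorem I.7.4, Part 2, the far left and far right in (9)
coincide a.s.

Next, for all $A \in \mathcal{A}_0\ (\subset \mathcal{A}_1)$, we have by (6)
\[ \int_A E(E(X|\mathcal{A}_1)|\mathcal{A}_0)\,dP = \int_A E(X|\mathcal{A}_1)\,dP = \int_A X\,dP = \int_A E(X|\mathcal{A}_0)\,dP, \]
so that the middle and far right expressions in (9) coincide a.s.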
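Identity (9) can be observed concretely when both $\sigma$-algebras are generated by finite partitions, one refining the other. The five-point space, weights and values below are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as Fr

# Hypothetical five-point probability space and r.v. X.
prob = [Fr(1, 10), Fr(2, 10), Fr(3, 10), Fr(1, 10), Fr(3, 10)]
X = [Fr(4), Fr(-1), Fr(2), Fr(7), Fr(0)]

fine = [[0], [1, 2], [3, 4]]     # atoms generating A_1
coarse = [[0, 1, 2], [3, 4]]     # atoms generating A_0 (each is a union of fine atoms)

def cond_exp(f, cells):
    """Conditional expectation given the sigma-algebra generated by the cells."""
    out = [None] * len(f)
    for cell in cells:
        mass = sum(prob[w] for w in cell)
        average = sum(f[w] * prob[w] for w in cell) / mass
        for w in cell:
            out[w] = average
    return out

left = cond_exp(cond_exp(X, coarse), fine)     # E(E(X|A_0)|A_1)
middle = cond_exp(cond_exp(X, fine), coarse)   # E(E(X|A_1)|A_0)
right = cond_exp(X, coarse)                    # E(X|A_0)
```

All three lists agree pointwise, as (9) predicts; conditioning in either order lands on the coarser $\sigma$-algebra.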


“Almost sure” versions of the usual convergence theorems for integrals are 
valid for the conditional expectation. 


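Theorem I.7.6 (Monotone convergence theorem for $E(\cdot|\mathcal{A}_0)$). Let $0 \le
X_1 \le X_2 \le \cdots$ (a.s.) be r.v.'s such that $\lim X_n := X \in L^1(P)$. Then
\[ E(X|\mathcal{A}_0) = \lim E(X_n|\mathcal{A}_0) \quad \text{a.s.} \]
Proof. By Part (3) in Theorem I.7.4,
\[ 0 \le E(X_1|\mathcal{A}_0) \le E(X_2|\mathcal{A}_0) \le \cdots \quad \text{a.s.} \]
Therefore, $h := \lim_n E(X_n|\mathcal{A}_0)$ exists a.s., and is $\mathcal{A}_0$-measurable (after being
extended as 0 on some $P$-null set). By the usual monotone convergence theorem,
we have for all $A \in \mathcal{A}_0$
\[ \int_A h\,dP = \lim_n \int_A E(X_n|\mathcal{A}_0)\,dP = \lim_n \int_A X_n\,dP = \int_A X\,dP, \]
hence $h = E(X|\mathcal{A}_0)$ a.s.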


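Corollary I.7.7 (Beppo Levi theorem for conditional expectation). Let
$X_n \ge 0$ (a.s.) be r.v.'s such that $\sum_n X_n \in L^1(P)$. Then
\[ E\Big(\sum_n X_n \Big| \mathcal{A}_0\Big) = \sum_n E(X_n|\mathcal{A}_0) \quad \text{a.s.} \]
Taking in particular $X_n = I_{B_n}$ with mutually disjoint $B_n \in \mathcal{A}$, we obtain the
a.s. $\sigma$-additivity of $P(\cdot|\mathcal{A}_0)$:
\[ P\Big(\bigcup_n B_n \Big| \mathcal{A}_0\Big) = \sum_n P(B_n|\mathcal{A}_0) \quad \text{a.s.} \tag{10} \]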




Theorem I.7.8 (Dominated convergence theorem for conditional
expectation). Let $\{X_n\}$ be a sequence of r.v.'s such that
\[ X_n \to X \quad \text{a.s.} \]
and
\[ |X_n| \le Y \in L^1(P). \]
Then
\[ E(X|\mathcal{A}_0) = \lim E(X_n|\mathcal{A}_0) \quad \text{a.s.} \]
Proof. By Properties (1) and (3) in Theorem I.7.4, $E(|E(X_n|\mathcal{A}_0)|) \le E(Y) <
\infty$, and therefore $E(X_n|\mathcal{A}_0)$ is finite a.s., and similarly $E(X|\mathcal{A}_0)$. Hence
$E(X_n|\mathcal{A}_0) - E(X|\mathcal{A}_0)$ is well-defined and finite a.s., and has absolute value equal
a.s. to
\[ |E(X_n - X|\mathcal{A}_0)| \le E(|X_n - X|\,|\mathcal{A}_0) \le E(Z_n|\mathcal{A}_0), \tag{11} \]
where
\[ Z_n := \sup_{k \ge n} |X_k - X| \quad (\in L^1(P)). \]
Since $Z_n$ is a non-increasing sequence (with limit 0 a.s.), Property (3) in
Theorem I.7.4 implies that $E(Z_n|\mathcal{A}_0)$ is a non-increasing sequence a.s. Let $h$
be its (a.s.) limit. After proper extension on a $P$-null set, $h$ is a non-negative
$\mathcal{A}_0$-measurable function. Since $0 \le Z_n \le 2Y \in L^1(P)$, the usual dominated
convergence theorem gives
\[ 0 \le \int_\Omega h\,dP \le \int_\Omega E(Z_n|\mathcal{A}_0)\,dP = \int_\Omega Z_n\,dP \to_n 0, \]
hence $h = 0$ a.s. By (11), this gives the conclusion of the theorem.


Property (2) in Theorem I.7.4 means that $\mathcal{A}_0$-measurable functions behave
like constants relative to the operation $E(\cdot|\mathcal{A}_0)$. This "constant-like" behavior is
a special case of the following.


Theorem I.7.9. Let $X, Y$ be r.v.'s such that $X, Y, XY \in L^1(P)$. If $X$ is
$\mathcal{A}_0$-measurable, then
\[ E(XY|\mathcal{A}_0) = X\,E(Y|\mathcal{A}_0) \quad \text{a.s.} \tag{12} \]
Proof. If $B \in \mathcal{A}_0$ and $X = I_B$, then for all $A \in \mathcal{A}_0$,
\[ \int_A X\,E(Y|\mathcal{A}_0)\,dP = \int_{A \cap B} E(Y|\mathcal{A}_0)\,dP = \int_{A \cap B} Y\,dP = \int_A XY\,dP, \]
so that (12) is valid for $\mathcal{A}_0$-measurable indicators, and by linearity, for all
$\mathcal{A}_0$-measurable simple functions. For an arbitrary $\mathcal{A}_0$-measurable r.v. $X \in
L^1(P)$, there exists a sequence $\{X_n\}$ of $\mathcal{A}_0$-measurable simple functions such
that $X_n \to X$ and $|X_n| \le |X|$. We have
\[ E(X_n Y|\mathcal{A}_0) = X_n\,E(Y|\mathcal{A}_0) \quad \text{a.s.} \]
Since $E(Y|\mathcal{A}_0)$ is $P$-integrable, it is a.s. finite, and therefore the right-hand side
converges a.s. to $X\,E(Y|\mathcal{A}_0)$.

Since $X_n Y \to XY$ and $|X_n Y| \le |XY| \in L^1(P)$, the left-hand side converges
a.s. to $E(XY|\mathcal{A}_0)$ by Theorem I.7.8, and the result follows.


I.7.2 Conditioning by an r.v.
I.7.10. Given an r.v. $X$, it induces a $\sigma$-subalgebra $\mathcal{A}_X$ of $\mathcal{A}$, where
\[ \mathcal{A}_X := \{X^{-1}(B);\ B \in \mathcal{B}\}, \]
and $\mathcal{B}$ is the Borel algebra of $\mathbb{R}$ (or $\mathbb{C}$). It is then "natural" to define
\[ E(Y|X) := E(Y|\mathcal{A}_X) \tag{13} \]
for any integrable r.v. $Y$.
Thus $E(Y|X)$ is the a.s. uniquely determined $\mathcal{A}_X$-measurable function such
that
\[ \int_{X^{-1}(B)} E(Y|X)\,dP = \int_{X^{-1}(B)} Y\,dP \tag{14} \]
for all $B \in \mathcal{B}$.

As a function of $B$, the right-hand side of (14) is a real (or complex)
measure on $\mathcal{B}$, absolutely continuous with respect to the probability measure
$P_X$ [$P_X(B) = 0$ means that $P(X^{-1}(B)) = 0$, which implies that the right-hand
side of (14) is zero]. By the Radon-Nikodym theorem, there exists a unique (up
to $P_X$-equivalence) Borel $L^1(P_X)$-function $h$ such that
\[ \int_B h\,dP_X = \int_{X^{-1}(B)} Y\,dP \qquad (B \in \mathcal{B}). \tag{15} \]
We shall denote (for $X$ real valued)
\[ h(x) := E(Y|X = x) \qquad (x \in \mathbb{R}), \tag{16} \]
and call this function the "conditional expectation of $Y$, given $X = x$". Thus, by
definition,
\[ \int_B E(Y|X = x)\,dP_X(x) = \int_{X^{-1}(B)} Y\,dP \qquad (B \in \mathcal{B}). \tag{17} \]
Taking $B = \mathbb{R}$ in (17), we see that
\[ E_{P_X}(E(Y|X = x)) = E(Y), \tag{18} \]
where $E_{P_X}$ denotes the expectation operator on $L^1(\mathbb{R},\mathcal{B},P_X)$.




The proof of Theorem I.7.3 shows that $E(Y|X = x)$ is $P_X$-a.s. constant on
each $P_X$-atom $B \in \mathcal{B}$. By (17), we have
\[ E(Y|X = x) = \frac{1}{P(X^{-1}(B))} \int_{X^{-1}(B)} Y\,dP \tag{19} \]
$P_X$-a.s. on $B$, for each $P_X$-atom $B \in \mathcal{B}$.
As before, the "conditional probability of $A \in \mathcal{A}$ given $X = x$" is defined by
\[ P(A|X = x) := E(I_A|X = x) \qquad (x \in \mathbb{R}), \]
or directly by (17) for the special case $Y = I_A$:
\[ \int_B P(A|X = x)\,dP_X(x) = P(A \cap X^{-1}(B)) \qquad (B \in \mathcal{B}). \tag{20} \]
If $B \in \mathcal{B}$ is a $P_X$-atom, we have by (19)
\[ P(A|X = x) = \frac{P(A \cap X^{-1}(B))}{P(X^{-1}(B))} = P(A|[X \in B]) \tag{21} \]
$P_X$-almost surely on $B$, where the right-hand side of (21) is the "elementary"
conditional probability of the event $A \in \mathcal{A}$, given the event $[X \in B]$. In
particular, if $B = \{x\}$ is a $P_X$-atom (i.e., if $P_X(\{x\}) > 0$; i.e., if $P[X = x] > 0$),
then
\[ P(A|X = x) = P(A|[X = x]) \qquad (A \in \mathcal{A}), \tag{22} \]
so that the notation is "consistent".
The relation between the $\mathcal{A}_X$-measurable function $E(Y|X)$ and the Borel
function $h(x) := E(Y|X = x)$ is stated next.


Theorem I.7.11. Let $(\Omega,\mathcal{A},P)$ be a probability space, and let $X, Y$ be (real)
r.v.'s, with $Y$ integrable. Then, $P$-almost surely,
\[ E(Y|X) = h(X), \]
where $h(x) := E(Y|X = x)$.
Proof. By (14) and (15),
\[ \int_{X^{-1}(B)} E(Y|X)\,dP = \int_B h\,dP_X \qquad (B \in \mathcal{B}). \tag{23} \]
We claim that
\[ \int_B h\,dP_X = \int_{X^{-1}(B)} h(X)\,dP \qquad (B \in \mathcal{B}) \tag{24} \]
for any real Borel $P_X$-integrable function $h$ on $\mathbb{R}$. If $h = I_C$ for $C \in \mathcal{B}$, (24) is
valid, since
\begin{align*}
\int_B h\,dP_X &= P_X(B \cap C) = P(X^{-1}(B \cap C)) \\
&= P(X^{-1}(B) \cap X^{-1}(C)) = \int_{X^{-1}(B)} I_{X^{-1}(C)}\,dP = \int_{X^{-1}(B)} h(X)\,dP.
\end{align*}
By linearity, (24) is then valid for simple Borel functions $h$. If $h$ is a non-negative
Borel function, let $\{h_n\}$ be a sequence of simple Borel functions such that $0 \le
h_1 \le h_2 \le \cdots$, and $\lim h_n = h$. Then $\{h_n(X)\}$ is a sequence of $\mathcal{A}_X$-measurable
functions such that
\[ 0 \le h_1(X) \le h_2(X) \le \cdots \]
and $\lim h_n(X) = h(X)$. By the monotone convergence theorem applied in the
measure spaces $(\mathbb{R},\mathcal{B},P_X)$ and $(\Omega,\mathcal{A},P)$, we have for all $B \in \mathcal{B}$:
\[ \int_B h\,dP_X = \lim_n \int_B h_n\,dP_X = \lim_n \int_{X^{-1}(B)} h_n(X)\,dP = \int_{X^{-1}(B)} h(X)\,dP. \]
For any real $P_X$-integrable Borel function $h$, write $h = h^+ - h^-$; then
\begin{align*}
\int_B h\,dP_X &= \int_B h^+\,dP_X - \int_B h^-\,dP_X \\
&= \int_{X^{-1}(B)} h^+(X)\,dP - \int_{X^{-1}(B)} h^-(X)\,dP \\
&= \int_{X^{-1}(B)} h(X)\,dP \qquad (B \in \mathcal{B}).
\end{align*}
Thus (24) is verified, and by (23), we have
\[ \int_A E(Y|X)\,dP = \int_A h(X)\,dP \]
for all $A \in \mathcal{A}_X$ $(:= \{X^{-1}(B);\ B \in \mathcal{B}\})$.
Since both integrands are in $L^1(\Omega,\mathcal{A}_X,P)$, it follows that they coincide
$P$-almost surely.


Theorem I.7.12. Let $X, Y$ be (real) r.v.'s, with $Y \in L^2(P)$. Then $Z = E(Y|X)$
is the (real) $\mathcal{A}_X$-measurable solution in $L^2(P)$ of the extremal problem
\[ \|Y - Z\|_2 = \min. \]
(Geometrically, $Z$ is the orthogonal projection of $Y$ onto $L^2(\Omega,\mathcal{A}_X,P)$.)
Proof. Write (for $Y, Z \in L^2(P)$):
\[ (Y - Z)^2 = [Y - E(Y|X)]^2 + [E(Y|X) - Z]^2 + 2[E(Y|X) - Z][Y - E(Y|X)]. \tag{25} \]
In particular, the third term is $\le (Y - Z)^2$. Similarly, we see that the negative
of the third term is also $\le (Y - Z)^2$. Hence this term has absolute value $\le
(Y - Z)^2 \in L^1(P)$. Since $E(Y|X) \in L^1(P)$, the functions $U := E(Y|X) - Z$,
$V := Y - E(Y|X)$, and $UV$ are all in $L^1(P)$, and $U$ is $\mathcal{A}_X$-measurable whenever
$Z$ is. By Theorem I.7.9 with $\mathcal{A}_0 = \mathcal{A}_X$,
\begin{align*}
E(UV|X) := E(UV|\mathcal{A}_X) &= U\,E(V|\mathcal{A}_X) = U\,E(V|X) \\
&= U[E(Y|X) - E(Y|X)] = 0.
\end{align*}
Hence by (25)
\[ E((Y - Z)^2|X) = E(U^2|X) + E(V^2|X). \]
Applying $E$, we obtain
\[ E((Y - Z)^2) = E(U^2) + E(V^2) \ge E(V^2), \]
that is, $\|Y - Z\|_2 \ge \|Y - E(Y|X)\|_2$, with the minimum attained when $U = 0$
($P$-a.s.), that is, when $Z = E(Y|X)$ a.s.


Applying Theorem I.7.11, we obtain the following extremal property of $h =
E(Y|X = \cdot)$.


Corollary I.7.13. Let $X, Y$ be (real) r.v.'s, with $Y \in L^2(P)$. Then the extremal
problem for (real) Borel functions $g$ on $\mathbb{R}$ with $g(X) \in L^2(P)$
\[ \|Y - g(X)\|_2 = \min \]
has the solution
\[ g = h := E(Y|X = \cdot) \quad \text{a.s.} \]
Thus $h(X)$ gives the best "mean square approximation" of $Y$ by "functions
of $X$". The graph of the equation
\[ y = h(x) \quad (= E(Y|X = x)) \]
is called the regression curve of $Y$ on $X$.


I.7.14. Linear regression. We consider the extremal problem of
Corollary I.7.13 with the stronger restriction that $g$ be linear. Thus we wish
to find values of the real parameters $a, b$ such that
\[ \|Y - (aX + b)\|_2 = \min, \]
where $X, Y$ are given non-degenerate $L^2(P)$-r.v.'s. Necessarily, $X, Y$ have finite
expectations $\mu_k$ and standard deviations $\sigma_k > 0$ ($k = 1$ for $X$, $k = 2$ for $Y$),
and we may define the so-called correlation coefficient of $X$ and $Y$
\[ \rho := \rho(X,Y) := \frac{\mathrm{cov}(X,Y)}{\sigma_1\sigma_2}. \]
By I.2.5, $|\rho| \le 1$.
We have
\begin{align*}
\|Y - (aX + b)\|_2^2 &= E\big([Y - \mu_2] - a[X - \mu_1] + [\mu_2 - (a\mu_1 + b)]\big)^2 \\
&= E(Y - \mu_2)^2 + a^2 E(X - \mu_1)^2 + [\mu_2 - (a\mu_1 + b)]^2 - 2aE((X - \mu_1)(Y - \mu_2)) \\
&= \sigma_2^2 + a^2\sigma_1^2 + [\mu_2 - (a\mu_1 + b)]^2 - 2a\rho\sigma_1\sigma_2 \\
&= (a\sigma_1 - \rho\sigma_2)^2 + (1 - \rho^2)\sigma_2^2 + [\mu_2 - (a\mu_1 + b)]^2 \ge (1 - \rho^2)\sigma_2^2,
\end{align*}
with equality (giving the minimal $L^2$-distance $\sigma_2\sqrt{1 - \rho^2}$) attained when
\[ a\sigma_1 - \rho\sigma_2 = 0, \qquad \mu_2 - (a\mu_1 + b) = 0, \]
that is, when
\[ a = a^* := \rho\sigma_2/\sigma_1; \qquad b = b^* := \mu_2 - a^*\mu_1. \]
In conclusion, the linear solution of our extremum problem (the so-called linear
regression of $Y$ on $X$) has the equation
\[ y = \mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}\,(x - \mu_1). \]
Note that the minimal $L^2$-distance vanishes iff $|\rho| = 1$; in that case $Y = a^*X +
b^*$ a.s.
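The formulas for $a^*$, $b^*$ and the minimal distance can be checked on a small discrete distribution. The sketch below assumes a hypothetical four-point joint distribution of $(X,Y)$ with equal weights; the function names are illustrative.

```python
import math

def linear_regression(points):
    """points: list of (x, y, p) with probabilities p summing to 1.

    Returns (a*, b*, rho, minimal L2 distance sigma_2 sqrt(1 - rho^2)).
    """
    mu1 = sum(p * x for x, y, p in points)
    mu2 = sum(p * y for x, y, p in points)
    var1 = sum(p * (x - mu1) ** 2 for x, y, p in points)
    var2 = sum(p * (y - mu2) ** 2 for x, y, p in points)
    cov = sum(p * (x - mu1) * (y - mu2) for x, y, p in points)
    rho = cov / math.sqrt(var1 * var2)
    a_star = rho * math.sqrt(var2 / var1)     # rho sigma_2 / sigma_1
    b_star = mu2 - a_star * mu1
    return a_star, b_star, rho, math.sqrt(var2 * (1 - rho ** 2))

def l2_distance(points, a, b):
    """||Y - (aX + b)||_2 for the given discrete distribution."""
    return math.sqrt(sum(p * (y - (a * x + b)) ** 2 for x, y, p in points))

# Hypothetical joint distribution: four equally likely points.
points = [(0.0, 1.0, 0.25), (1.0, 3.0, 0.25), (2.0, 2.0, 0.25), (3.0, 5.0, 0.25)]
a_star, b_star, rho, d_min = linear_regression(points)
```

Any other choice of $(a,b)$, for instance `l2_distance(points, a_star + 0.2, b_star)`, gives a distance at least `d_min`.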


I.7.15. Conditional distribution; discrete case.
Let $X, Y$ be discrete real r.v.'s, with respective ranges $\{x_j\}$ and $\{y_k\}$. The
vector-valued r.v. $(X,Y)$ assumes the value $(x_j, y_k)$ with the positive probability
$p_{jk}$, where
\[ \sum_{j,k} p_{jk} = 1. \]
The joint distribution function of $(X,Y)$ is defined (in general, for any real r.v.'s)
by
\[ F(x,y) := P[X < x, Y < y] \qquad (x, y \in \mathbb{R}). \]
In the discrete case,
\[ F(x,y) = \sum_{x_j < x,\ y_k < y} p_{jk}. \]
The marginal distributions are defined in general by
\[ F_X(x) := P[X < x] = F(x,\infty); \qquad F_Y(y) := P[Y < y] = F(\infty,y). \]
In our case,
\[ F_X(x) = \sum_{x_j < x} p_{j\cdot}; \qquad F_Y(y) = \sum_{y_k < y} p_{\cdot k}, \]
where
\[ p_{j\cdot} := \sum_k p_{jk} = P[X = x_j]; \qquad p_{\cdot k} := \sum_j p_{jk} = P[Y = y_k]. \]
Each singleton $\{x_j\}$ is a $P_X$-atom (because $P[X = x_j] = p_{j\cdot} > 0$). By (22) in
Section I.7.10 (with $A = [Y = y_k]$), we have
\[ P(Y = y_k|X = x_j) = \frac{p_{jk}}{p_{j\cdot}} \qquad (j,k = 1,2,\dots) \]
and similarly
\[ P(X = x_j|Y = y_k) = \frac{p_{jk}}{p_{\cdot k}} \qquad (j,k = 1,2,\dots). \]
Note that
\[ \sum_k P(Y = y_k|X = x_j) = 1, \]
and therefore the function of $y$ given by
\[ F(y|X = x_j) := \sum_{y_k < y} P(Y = y_k|X = x_j) = (1/p_{j\cdot}) \sum_{y_k < y} p_{jk} \]
is a distribution function. It is called the conditional distribution of $Y$, given
$X = x_j$.
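For a concrete picture of the discrete case, the conditional probabilities $p_{jk}/p_{j\cdot}$ and the conditional distribution function can be computed from a joint table. The table below is hypothetical; note the strict inequality $y_k < y$, matching the convention $F(x) = P[X < x]$ used here.

```python
from fractions import Fraction as Fr

# Hypothetical joint table: p[j][k] = P[X = xs[j], Y = ys[k]].
xs = [0, 1]
ys = [10, 20, 30]
p = [[Fr(1, 12), Fr(2, 12), Fr(3, 12)],
     [Fr(2, 12), Fr(3, 12), Fr(1, 12)]]

def conditional_pmf_of_y(j):
    """List of P(Y = y_k | X = x_j) = p_jk / p_j. over k."""
    row_total = sum(p[j])                     # p_j. = P[X = x_j]
    return [pjk / row_total for pjk in p[j]]

def conditional_cdf_of_y(j, y):
    """F(y | X = x_j): sum of the conditional probabilities over y_k < y."""
    return sum(q for yk, q in zip(ys, conditional_pmf_of_y(j)) if yk < y)

pmf_given_x0 = conditional_pmf_of_y(0)
cdf_at_25 = conditional_cdf_of_y(0, 25)
```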


I.7.16. Conditional distribution; continuous case. Consider now the case
where the vector-valued r.v. $(X,Y)$ has a (joint) density $h$ (cf. Section I.5). Then
the distribution function of $(X,Y)$ is given by
\[ F(x,y) = \int_{-\infty}^{x} \int_{-\infty}^{y} h(s,t)\,dt\,ds \qquad (x, y \in \mathbb{R}). \]
By Tonelli's theorem, the order of integration is irrelevant. At all continuity
points $(x,y)$ of $h$, one has $h(x,y) = \partial^2 F/\partial x\,\partial y$. The marginal density functions
are defined by
\[ h_X(x) := \int_{\mathbb{R}} h(x,y)\,dy; \qquad h_Y(y) := \int_{\mathbb{R}} h(x,y)\,dx \qquad (x, y \in \mathbb{R}). \]
These are densities for the distribution functions $F_X$ and $F_Y$, respectively.
If $S := \{(x,y) \in \mathbb{R}^2;\ h_X(x) = 0\}\ (= h_X^{-1}(\{0\}) \times \mathbb{R})$, then
\[ P[(X,Y) \in S] = P[X \in h_X^{-1}(\{0\})] = \int_{h_X^{-1}(\{0\})} h_X(x)\,dx = 0, \]
so that $S$ may be disregarded. On $\mathbb{R}^2 - S$, define
\[ h(y|x) := \frac{h(x,y)}{h_X(x)}. \tag{26} \]
This function is called the conditional distribution density of $Y$, given $X = x$.
The terminology is motivated by the following.


Proposition I.7.17. In the setting shown, we have $P_X$-almost surely
\[ E(Y|X = x) = \int_{\mathbb{R}} y\,h(y|x)\,dy. \]
Proof. For all $B \in \mathcal{B}(\mathbb{R})$, we have by Fubini's theorem:
\begin{align*}
\int_{X^{-1}(B)} Y\,dP &= \iint_{B \times \mathbb{R}} y\,h(x,y)\,dx\,dy = \int_B h_X(x) \int_{\mathbb{R}} y\,h(y|x)\,dy\,dx \\
&= \int_B \Big(\int_{\mathbb{R}} y\,h(y|x)\,dy\Big)\,dP_X(x),
\end{align*}
and the conclusion follows from (17) in Section I.7.10.


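If $h$ is continuous on $\mathbb{R}^2$, we also have the following.


Proposition I.7.18. Suppose the joint distribution density $h$ of $(X,Y)$ is
continuous on $\mathbb{R}^2$. Then for all $x \in \mathbb{R}$ for which $h_X(x) \neq 0$ and for all $B \in \mathcal{B}(\mathbb{R})$,
we have
\[ \int_B h(y|x)\,dy = \lim_{\delta \to 0+} P(Y \in B \,|\, x - \delta < X < x + \delta). \]
Proof. For $\delta > 0$,
\begin{align*}
P(Y \in B \,|\, x - \delta < X < x + \delta) &= \frac{P([Y \in B] \cap [x - \delta < X < x + \delta])}{P[x - \delta < X < x + \delta]} \\
&= \frac{\int_{x-\delta}^{x+\delta} \int_B h(s,y)\,dy\,ds}{\int_{x-\delta}^{x+\delta} h_X(s)\,ds}.
\end{align*}
Divide both numerator and denominator by $2\delta$ and let $\delta \to 0$. The continuity
assumption implies that, for all $x$ for which $h_X(x) \neq 0$, the last expression has
the limit
\[ \frac{\int_B h(x,y)\,dy}{h_X(x)} = \int_B h(y|x)\,dy. \]
It follows from (26) that $\int_{\mathbb{R}} h(y|x)\,dy = 1$, so that $h(y|x)$ (defined for all $x$
such that $h_X(x) \neq 0$) is the density of a distribution function:
\[ F(y|x) := \int_{-\infty}^{y} h(t|x)\,dt, \]
called the conditional distribution of $Y$ given $X = x$.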


Example I.7.19 (The binormal distribution). We say that $X, Y$ are
binormally distributed if they have the joint density function (called the binormal
density) given by
\[ h(x,y) = \frac{1}{c} \exp\Big(-Q\Big(\frac{x - \mu_1}{\sigma_1}, \frac{y - \mu_2}{\sigma_2}\Big)\Big), \]
where $Q$ is the positive definite quadratic form
\[ Q(s,t) := \frac{s^2 - 2\rho st + t^2}{2(1 - \rho^2)} \]
and
\[ c = 2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2} \]
($\mu_k \in \mathbb{R}$, $\sigma_k > 0$, $-1 < \rho < 1$ are the parameters of the distribution).

Note that
\[ Q(s,t) = [(s - \rho t)^2 + (1 - \rho^2)t^2]/[2(1 - \rho^2)] \ge 0 \]
for all real $s, t$, with equality holding iff $s = t = 0$. Therefore $h$ attains its
absolute maximum $1/c$ at the unique point $(x,y) = (\mu_1,\mu_2)$.

The sections of the surface $z = h(x,y)$ with the planes $z = a$ are empty for
$a > 1/c$ (and $a \le 0$); a single point for $a = 1/c$; and ellipses for $0 < a < 1/c$ (the
surface is "bell-shaped").

In order to calculate the integral $\iint_{\mathbb{R}^2} h(x,y)\,dx\,dy$, we make the
transformation
\[ x = \mu_1 + \sigma_1 s = \mu_1 + \sigma_1(u + \rho t), \qquad y = \mu_2 + \sigma_2 t, \]
where $u := s - \rho t$. Then $(u,t)$ ranges over $\mathbb{R}^2$ when $(x,y)$ does, and
\[ \frac{\partial(x,y)}{\partial(u,t)} = \sigma_1\sigma_2 > 0. \]
Therefore, the previous integral is equal to
\begin{align*}
(1/c) \iint_{\mathbb{R}^2} e^{-[u^2 + (1 - \rho^2)t^2]/(2(1 - \rho^2))}\,\sigma_1\sigma_2\,du\,dt
&= \big(1/\sqrt{2\pi(1 - \rho^2)}\big) \int_{\mathbb{R}} e^{-u^2/2(1 - \rho^2)}\,du = 1,
\end{align*}
since the last integral is that of the $N(0, 1 - \rho^2)$-density.
Thus $h$ is indeed the density of a two-dimensional distribution function.
Since $Q(s,t) = s^2/2 + (t - \rho s)^2/2(1 - \rho^2)$, we get (for $x \in \mathbb{R}$ fixed, with
$s = (x - \mu_1)/\sigma_1$ and $t = (y - \mu_2)/\sigma_2$, so that $dy = \sigma_2\,dt$):
\[ h_X(x) = (1/c)\,e^{-s^2/2} \int_{\mathbb{R}} e^{-(t - \rho s)^2/2(1 - \rho^2)}\,\sigma_2\,dt = \frac{1}{\sqrt{2\pi}\,\sigma_1}\,e^{-(x - \mu_1)^2/2\sigma_1^2}. \]
Thus $h_X$ is the $N(\mu_1,\sigma_1^2)$-density. By symmetry, the marginal density $h_Y$ is the
$N(\mu_2,\sigma_2^2)$-density. In particular, the meaning of the parameters $\mu_k$ and $\sigma_k^2$ has
been clarified (as the expectations and variances of $X$ and $Y$).

We have (with $s, t$ related to $x, y$ as before and $c' = \sqrt{2\pi\sigma_2^2(1 - \rho^2)}$):
\begin{align*}
h(y|x) = \frac{h(x,y)}{h_X(x)} &= (1/c') \exp\{-(t - \rho s)^2/2(1 - \rho^2)\} \\
&= (1/c') \exp\Big\{-\frac{[y - (\mu_2 + \rho(\sigma_2/\sigma_1)(x - \mu_1))]^2}{2\sigma_2^2(1 - \rho^2)}\Big\}.
\end{align*}
Thus $h(y|x)$ is the $N(\mu_2 + \rho(\sigma_2/\sigma_1)(x - \mu_1),\ \sigma_2^2(1 - \rho^2))$ density.
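The identification of $h(y|x)$ can be checked numerically at any point: the quotient $h(x,y)/h_X(x)$ should agree with the stated normal density. The parameter values and the test point below are arbitrary illustrative choices.

```python
import math

def binormal_density(x, y, mu1, mu2, s1, s2, rho):
    """h(x, y) = (1/c) exp(-Q(s, t)) with s, t the standardized variables."""
    s = (x - mu1) / s1
    t = (y - mu2) / s2
    q = (s * s - 2 * rho * s * t + t * t) / (2 * (1 - rho ** 2))
    c = 2 * math.pi * s1 * s2 * math.sqrt(1 - rho ** 2)
    return math.exp(-q) / c

def normal_density(z, mean, var):
    """Density of N(mean, var) at z."""
    return math.exp(-(z - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Arbitrary parameters and evaluation point, for illustration only.
mu1, mu2, s1, s2, rho = 1.0, -2.0, 2.0, 0.5, 0.6
x, y = 2.5, -1.7

quotient = binormal_density(x, y, mu1, mu2, s1, s2, rho) / normal_density(x, mu1, s1 ** 2)
cond_mean = mu2 + rho * (s2 / s1) * (x - mu1)
cond_var = s2 ** 2 * (1 - rho ** 2)
conditional = normal_density(y, cond_mean, cond_var)
```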




By Proposition I.7.17, for all real $x$,
\[ E(Y|X = x) = \mu_2 + \rho(\sigma_2/\sigma_1)(x - \mu_1), \]
with an analogous formula for $E(X|Y = y)$.

Thus, for binormally distributed $X, Y$, the regression curves $y = E(Y|X =
x)$ and $x = E(X|Y = y)$ coincide with the linear regression curves (cf.
Section I.7.14). They intersect at $(\mu_1,\mu_2)$, and the coefficient $\rho$ here coincides
with the correlation coefficient $\rho(X,Y)$ (cf. Section I.7.14). Indeed (with previous
notations),
\begin{align*}
\rho(X,Y) &= E\Big(\frac{X - \mu_1}{\sigma_1} \cdot \frac{Y - \mu_2}{\sigma_2}\Big) \\
&= (1/c) \iint_{\mathbb{R}^2} (u + \rho t)\,t\,\exp\Big\{-\frac{u^2}{2(1 - \rho^2)} - \frac{t^2}{2}\Big\}\,\sigma_1\sigma_2\,du\,dt.
\end{align*}
The integrand splits as the sum of two terms. The term with the factor $ut$ is
odd in each variable; by Fubini's theorem, its integral vanishes. The remaining
integral is
\[ \rho\,\frac{1}{\sqrt{2\pi(1 - \rho^2)}} \int_{\mathbb{R}} e^{-u^2/2(1 - \rho^2)}\,du \cdot \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} t^2 e^{-t^2/2}\,dt = \rho. \]
Note in particular that if the binormally distributed r.v.'s $X, Y$ are uncorrelated,
then $\rho = \rho(X,Y) = 0$, and therefore $h(x,y) = h_X(x)h_Y(y)$. By Proposition I.5.1,
it follows that $X, Y$ are independent. Since the converse is generally true
(cf. Section I.2.5), we have the following.


Proposition I.7.20. If the r.v.'s $X, Y$ are binormally distributed, then they are
independent iff they are uncorrelated.


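I.8 Series of $L^2$ random variables


This section considers the a.s. convergence of series of independent r.v.'s.
We fix the following notation: $\{X_k\}_{k=1}^{\infty}$ is a sequence of real independent
central $L^2(P)$ random variables, with variances $\sigma_k^2 := \sigma^2(X_k)$; for $n = 1,2,\dots$, we let
\[ S_n := \sum_{k=1}^{n} X_k \]
and
\[ T_n := \max_{1 \le k \le n} |S_k|. \]
Lemma I.8.1 (Kolmogorov). For each $\epsilon > 0$ and $n = 1,2,\dots$
\[ P[T_n \ge \epsilon] \le (1/\epsilon^2) \sum_{k=1}^{n} \sigma_k^2. \]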




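Proof. Write
\begin{align*}
[T_n \ge \epsilon] = [|S_1| \ge \epsilon] &\cup \{[|S_2| \ge \epsilon] \cap [|S_1| < \epsilon]\} \cup \cdots \\
&\cup \{[|S_n| \ge \epsilon] \cap [|S_k| < \epsilon;\ k = 1,\dots,n-1]\},
\end{align*}
and denote the $k$th set in this union by $A_k$.
The r.v. $I_{A_k}S_k$ depends only on $X_1,\dots,X_k$, hence is independent of
$S_n - S_k = X_{k+1} + \cdots + X_n$. By independence and centrality of the $X_j$, the cross term
$\int_{A_k} S_k(S_n - S_k)\,dP$ vanishes, and therefore
\[ \int_{A_k} S_n^2\,dP = \int_{A_k} S_k^2\,dP + \int_{A_k} (S_n - S_k)^2\,dP \ge \int_{A_k} S_k^2\,dP \ge \epsilon^2 P(A_k). \]
Since the sets $A_k$ are mutually disjoint, we get
\begin{align*}
\epsilon^2 P[T_n \ge \epsilon] = \sum_{k=1}^{n} \epsilon^2 P(A_k) &\le \sum_{k=1}^{n} \int_{A_k} S_n^2\,dP \\
&= \int_{[T_n \ge \epsilon]} S_n^2\,dP \le E(S_n^2) = \sum_{k=1}^{n} \sigma_k^2.
\end{align*}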
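The inequality obtained at the end of the proof, $P[T_n \ge \epsilon] \le \epsilon^{-2}\sum_{k \le n}\sigma_k^2$, can be checked exactly in a simple case. The sketch below assumes independent fair $\pm 1$ steps (central, variance 1), an illustrative choice, and enumerates all $2^n$ paths.

```python
from fractions import Fraction as Fr
from itertools import product

def kolmogorov_check(n, eps):
    """Exact P[max_k |S_k| >= eps] for n fair +/-1 steps, and the bound n / eps^2."""
    hits = 0
    for steps in product((-1, 1), repeat=n):
        partial_sum, t_n = 0, 0
        for step in steps:
            partial_sum += step
            t_n = max(t_n, abs(partial_sum))   # T_n = max_{k <= n} |S_k|
        hits += t_n >= eps
    probability = Fr(hits, 2 ** n)
    bound = Fr(n) / Fr(eps) ** 2               # sum of the n unit variances over eps^2
    return probability, bound

probability, bound = kolmogorov_check(n=10, eps=4)
```

For small $\epsilon$ the bound exceeds 1 and is vacuous; it becomes informative once $\epsilon^2$ is larger than the total variance.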


Theorem I.8.2. (For $X_k$ as previously.) If $\sum_k \sigma_k^2 < \infty$, then $\sum_k X_k$ converges a.s.


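Proof. Fix $\epsilon > 0$. For $n, m \in \mathbb{N}$, denote

\[ A_{nm} = \Big[\max_{1\le k\le n} \Big|\sum_{j=m+1}^{m+k} X_j\Big| > \epsilon\Big] \]

and

\[ A_m = \Big[\sup_{1\le k<\infty} \Big|\sum_{j=m+1}^{m+k} X_j\Big| > \epsilon\Big]. \]

Then $A_m$ is the union of the increasing sequence $\{A_{nm}\}_n$, so that

\[ P(A_m) = \lim_n P(A_{nm}). \]

By Lemma I.8.1,

\[ P(A_{nm}) \le (1/\epsilon^2)\sum_{k=m+1}^{m+n} \sigma_k^2 \le (1/\epsilon^2)\sum_{k=m+1}^{\infty} \sigma_k^2. \]

Hence, for all $m$,

\[ P(A_m) \le (1/\epsilon^2)\sum_{k=m+1}^{\infty} \sigma_k^2, \]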




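and therefore

\[ P\Big[\inf_m \sup_k \Big|\sum_{j=m+1}^{m+k} X_j\Big| > \epsilon\Big] \le (1/\epsilon^2)\sum_{k=m+1}^{\infty} \sigma_k^2. \]

The right-hand side tends to zero when $m \to \infty$ (by hypothesis), and therefore the left-hand side equals 0. Thus,

\[ \inf_m \sup_k \Big|\sum_{j=m+1}^{m+k} X_j\Big| \le \epsilon \quad \text{a.s.}, \]

hence, there exists $m$ such that for all $k$, one has $\big|\sum_{j=m+1}^{m+k} X_j\big| < 2\epsilon$ (a.s.). Since $\epsilon$ is arbitrary, the partial sums satisfy the Cauchy criterion a.s., and the series converges a.s.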
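
Theorem I.8.2 can be illustrated with the random-sign harmonic series $\sum_k \pm 1/k$, for which $\sum_k \sigma_k^2 = \sum_k 1/k^2 < \infty$. The sketch below is an illustration under that assumption only; the helper names `tail_variance` and `tail_oscillation` and the cut-off values are hypothetical. It computes the tail variance appearing in the proof and the largest excursion of one simulated tail $\sum_{j=m+1}^{m+k} X_j$.

```python
import random

def tail_variance(m, n_terms):
    """Sum of sigma_j^2 = 1/j^2 over j = m+1, ..., m+n_terms."""
    return sum(1.0 / j**2 for j in range(m + 1, m + n_terms + 1))

def tail_oscillation(m=1000, n_terms=20000, seed=1):
    """Max over k <= n_terms of |sum_{j=m+1}^{m+k} X_j| for one sample path."""
    rng = random.Random(seed)
    partial = 0.0
    worst = 0.0
    for j in range(m + 1, m + n_terms + 1):
        partial += rng.choice((-1.0, 1.0)) / j   # X_j = +/- 1/j, central
        worst = max(worst, abs(partial))
    return worst

var_tail = tail_variance(1000, 20000)
osc = tail_oscillation()
print(var_tail, osc)
```

By Lemma I.8.1, the probability that the excursion exceeds $\epsilon$ is at most `var_tail / eps**2`, which is why the simulated tail stays small once $m$ is large.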
For an a.s. bounded sequence of r.v.'s, a converse result is:

Theorem I.8.3. Let $\{X_k\}$ be a sequence of independent central r.v.'s such that
(i) $|X_k| \le c$ a.s.; and
(ii) $P[\sum_k X_k \text{ converges}] > 0$.

Then $\sum_k \sigma_k^2 < \infty$.


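Proof. By (i), $|S_n| \le nc$ a.s.

Let $A$ be the set on which $\{S_n\}$ converges. Since $PA > 0$ by (ii), it follows from Theorem 1.57 that $\{S_n\}$ converges uniformly on some measurable subset $B \subset A$ with $PB > 0$. Hence $|S_n| \le d$ for all $n$ on some measurable subset $E \subset B$ with $PE > 0$. Let

\[ E_n = [|S_k| \le d,\ 1 \le k \le n] \quad (n \in \mathbb{N}). \]

The sequence $\{E_n\}$ is decreasing, with intersection $E$. Let $\alpha_0 = 0$ and

\[ \alpha_n = \int_{E_n} S_n^2\,dP \quad (n \in \mathbb{N}). \]

Write

\[ F_n = E_{n-1} - E_n\ (\subset E_{n-1}); \qquad E_n = E_{n-1} - F_n, \]

so that

\[ \alpha_n - \alpha_{n-1} = \int_{E_{n-1}} S_n^2\,dP - \int_{F_n} S_n^2\,dP - \int_{E_{n-1}} S_{n-1}^2\,dP. \]

Since $X_n$ and $S_{n-1}$ are central and independent, we have by Bienaymé's identity

\[ \int_{E_{n-1}} S_n^2\,dP = \int_{E_{n-1}} X_n^2\,dP + \int_{E_{n-1}} S_{n-1}^2\,dP, \]

and therefore

\[ \alpha_n - \alpha_{n-1} = \int_{E_{n-1}} X_n^2\,dP - \int_{F_n} S_n^2\,dP. \]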




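On $F_n\ (\subset E_{n-1})$, we have

\[ |S_n| \le |X_n| + |S_{n-1}| \le c + d \quad \text{a.s.} \]

Therefore

\[ \alpha_n - \alpha_{n-1} \ge \int_{E_{n-1}} X_n^2\,dP - (c+d)^2 P(F_n). \qquad (1) \]

Since $I_{E_{n-1}}$ (which is defined exclusively by means of $X_1,\ldots,X_{n-1}$) and $X_n^2$ are independent, it follows from Theorem I.2.2 that

\[ \int_{E_{n-1}} X_n^2\,dP = E(I_{E_{n-1}}X_n^2) = P(E_{n-1})\sigma^2(X_n) \ge P(E)\sigma_n^2. \]

Hence, by (1), and summing all the inequalities for $n = 1,\ldots,k$, we obtain

\[ \alpha_k \ge P(E)\sum_{n=1}^k \sigma_n^2 - (c+d)^2\sum_{n=1}^k P(F_n); \]

hence, since the sets $F_n$ are mutually disjoint and $\alpha_k \le d^2$,

\[ P(E)\sum_{n=1}^k \sigma_n^2 \le d^2 + (c+d)^2 \quad (k \in \mathbb{N}), \]

so that $\sum_n \sigma_n^2 < \infty$ (because $P(E) > 0$).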
We consider next the non-central case. 


Theorem I.8.4. Let $\{X_k\}$ be a sequence of independent r.v.'s, such that $|X_k| \le c$, $k = 1,2,\ldots$, a.s. Then $\sum_k X_k$ converges a.s. iff the two (numerical) series $\sum_k E(X_k)$ and $\sum_k \sigma_k^2$ converge.


Proof. Suppose the two "numerical" series converge (for this part of the proof, the hypothesis $|X_k| \le c$ a.s. is not needed, and the $X_k$ are only assumed in $L^2$, our standing hypothesis). Let $Y_k = X_k - E(X_k)$. Then the $Y_k$ are independent central $L^2$ random variables, and $\sum_k \sigma^2(Y_k) = \sum_k \sigma^2(X_k) < \infty$. By Theorem I.8.2, $\sum_k Y_k$ converges a.s., and therefore $\sum_k X_k = \sum_k [Y_k + E(X_k)]$ converges a.s., since $\sum_k E(X_k)$ converges by hypothesis.

Conversely, suppose that $\sum X_k$ converges a.s.




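Define on the product probability space
\[ (\Omega, \mathcal{A}, P) \times (\Omega, \mathcal{A}, P) \]
the random variables
\[ Z_n(\omega_1, \omega_2) = X_n(\omega_1) - X_n(\omega_2). \]

Then the $Z_n$ are independent. They are central, since
\[ E(Z_n) = \int_{\Omega\times\Omega} [X_n(\omega_1) - X_n(\omega_2)]\,d(P \times P) = \int_\Omega X_n(\omega_1)\,dP(\omega_1) - \int_\Omega X_n(\omega_2)\,dP(\omega_2) = E(X_n) - E(X_n) = 0. \]

Also $|Z_n| \le 2c$.
Furthermore, $\sum Z_n$ converges almost surely, because

\[ \{(\omega_1,\omega_2) \in \Omega \times \Omega;\ \textstyle\sum Z_n \text{ diverges}\} \subset \{(\omega_1,\omega_2);\ \textstyle\sum X_n(\omega_1) \text{ diverges}\} \cup \{(\omega_1,\omega_2);\ \textstyle\sum X_n(\omega_2) \text{ diverges}\}, \]

and both sets in the union have $P \times P$-measure zero (by our a.s. convergence hypothesis on $\sum X_n$).
By Theorem I.8.3, it follows that $\sum \sigma^2(Z_n) < \infty$. However,

\[ \sigma^2(Z_n) = \int_{\Omega\times\Omega} [X_n(\omega_1) - X_n(\omega_2)]^2\,d(P \times P). \]
Expanding the square and integrating, we see that
\[ \sigma^2(Z_n) = 2[E(X_n^2) - E(X_n)^2] = 2\sigma^2(X_n). \]
Therefore, $\sum \sigma^2(X_n) < \infty$, and since the $Y_n$ are central, and $\sum \sigma^2(Y_n) = \sum \sigma^2(X_n) < \infty$, we conclude from Theorem I.8.2 that $\sum Y_n$ converges a.s.; but then $\sum E(X_n) = \sum (X_n - Y_n)$ converges as well.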


We consider finally the general case of a series of independent $L^2$ random variables.


Theorem I.8.5 (Kolmogorov's "three series theorem"). Let $\{X_n\}$ be a sequence of real independent $L^2(P)$ random variables. For any real $k > 0$ and $n \in \mathbb{N}$, denote

\[ E_n := [|X_n| \le k]; \qquad X_n' := I_{E_n}X_n. \]




Then the series $\sum X_n$ converges a.s. iff the following numerical series (a), (b), and (c) converge:

(a) $\sum_n P(E_n^c)$;
(b) $\sum_n E(X_n')$;
(c) $\sum_n \sigma^2(X_n')$.

Proof. Consider the "truncated" r.v.'s
\[ Y_n^+ = I_{E_n}X_n + kI_{E_n^c}, \]

and $Y_n^-$ defined similarly with $-k$ instead of $k$.

If $\sum X_n(\omega)$ converges, then $X_n(\omega) \to 0$, so that $\omega \in E_n$ for $n > n_0$, and therefore, $Y_n^+(\omega) = Y_n^-(\omega) = X_n(\omega)$ for all $n > n_0$; hence, both series $\sum Y_n^+(\omega)$ and $\sum Y_n^-(\omega)$ converge. Conversely, if one of these two series converges (say the first), then $Y_n^+(\omega) \to 0$, so that $Y_n^+(\omega) \ne k$ for $n > n_0$, hence, necessarily $Y_n^+(\omega) = X_n(\omega)$ for $n > n_0$, and so $\sum X_n(\omega)$ converges.

We showed that $\sum X_n(\omega)$ converges iff the series of $Y_n^+$ and $Y_n^-$ both converge at $\omega$; therefore, $\sum X_n$ converges a.s. iff both $Y$-series converge a.s. Since $Y_n^+$ and $Y_n^-$ satisfy the hypothesis of Theorem I.8.4, we conclude from that theorem that $\sum X_n$ converges a.s. iff the numerical series $\sum E(Y_n^+)$, $\sum \sigma^2(Y_n^+)$, and the corresponding series for $Y_n^-$ converge. It then remains to show that the convergence of these four series is equivalent to the convergence of the three series (a)-(c).

Since

\[ E(Y_n^+) = E(X_n') + kP(E_n^c) \]

(and a similar formula for $Y_n^-$), we see by addition and subtraction that the convergence of the series (a) and (b) is equivalent to the convergence of the two series
\[ \sum E(Y_n^+) \quad \text{and} \quad \sum E(Y_n^-). \]
Next
\[ \sigma^2(Y_n^+) = E\{(Y_n^+)^2\} - \{E(Y_n^+)\}^2 = E\{(X_n')^2\} + k^2P(E_n^c) - [E(X_n') + kP(E_n^c)]^2 = \sigma^2(X_n') + k^2P(E_n)P(E_n^c) - 2kE(X_n')P(E_n^c), \qquad (2) \]

and similarly for $Y_n^-$ (with $-k$ replacing $k$).

If the series (a)-(c) converge, then we already know that $\sum E(Y_n^+)$ and $\sum E(Y_n^-)$ converge. The convergence of (b) also implies that $|E(X_n')| \le M$ for all $n$, and therefore,

\[ |E(X_n')P(E_n^c)| \le MP(E_n^c), \]

so that $\sum E(X_n')P(E_n^c)$ converges (by convergence of (a)).
Since $0 \le P(E_n)P(E_n^c) \le P(E_n^c)$, the series $\sum P(E_n)P(E_n^c)$ converges (by convergence of (a)).




Relation (2) and the convergence of (c) imply therefore that $\sum \sigma^2(Y_n^+)$ converges (and similarly for $Y_n^-$), as wanted.

Conversely, if the "four series" mentioned converge, we saw already that the series (a) and (b) converge, and this in turn implies the convergence of the series

\[ \sum E(X_n')P(E_n^c) \quad \text{and} \quad \sum P(E_n)P(E_n^c). \]

By Relation (2), the convergence of $\sum \sigma^2(Y_n^+)$ implies therefore the convergence of the series (c) as well.


I.9 Infinite divisibility 


Definition I.9.1. A random variable $X$ (or its distribution function $F$, or its characteristic function $f$) is infinitely divisible (i.d.) if, for each $n \in \mathbb{N}$, $F$ is the distribution function of a sum of $n$ independent r.v.'s with the same distribution function $F_n$.

Equivalently, $X$ is i.d. if there exists a triangular array of r.v.'s

\[ \{X_{nk};\ 1 \le k \le n,\ n = 1,2,\ldots\} \qquad (1) \]

such that, for each $n$, the r.v.'s of the $n$th row are independent and equidistributed, and their sum $T_n =_d X$ (if $X, Y$ are r.v.'s, we write $X =_d Y$ when they have the same distribution).

By the uniqueness theorem for ch.f.'s, $X$ is i.d. iff, for each $n$, there exists a ch.f. $f_n$ such that

\[ f = f_n^n. \qquad (2) \]

In terms of distribution functions, infinite divisibility of $F$ means the existence, for each $n$, of a distribution function $F_n$ such that

\[ F = F_n^{(n)}, \qquad (3) \]

where $G^{(n)} := G * \cdots * G$ ($n$ times) for any distribution function $G$. The convolution $F * G$ of two distribution functions is defined by

\[ (F * G)(x) = \int_{\mathbb{R}} F(x - y)\,dG(y) \quad (x \in \mathbb{R}). \]

It is clearly a distribution function. An application of Fubini's theorem shows that its ch.f. is precisely $fg$ (where $f, g$ are the ch.f.'s of $F, G$, respectively). It then follows from the uniqueness theorem for ch.f.'s (or directly!) that convolution of distribution functions is commutative and associative. We may then omit parentheses and write $F_1 * F_2 * \cdots * F_n$ for the convolution of finitely many distribution functions. In particular, the repeated convolutions $G^{(n)}$ mentioned earlier make sense, and criterion (3) is clearly equivalent to (2).




Example I.9.2. 


(1) The Poisson distribution is i.d.: take $F_n$ to be Poisson with parameter $\lambda/n$ (where $\lambda$ is the parameter of $F$). Then indeed (cf. Section I.3.9):

\[ f_n^n(u) = [e^{(\lambda/n)(e^{iu} - 1)}]^n = f(u). \]

(2) The normal distribution (parameters $\mu, \sigma^2$) is i.d.: take $F_n$ to be the normal distribution with parameters $\mu/n, \sigma^2/n$. Then (cf. Section I.3.12)

\[ f_n^n(u) = [e^{iu\mu/n - (\sigma u)^2/2n}]^n = f(u). \]


(3) The Gamma distribution (parameters $p, b$) is i.d.: take $F_n$ to be the Gamma distribution with parameters $p/n, b$. Then (cf. Section I.3.15):

\[ f_n^n(u) = \Big[\Big(1 - \frac{iu}{b}\Big)^{-p/n}\Big]^n = f(u). \]


It is also clear that the Cauchy distribution is i.d. (cf. Section I.3.14), while
the Laplace distribution is not.
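
Criterion (2) for the Poisson and normal cases can be checked numerically. The following sketch assumes the standard characteristic functions $e^{\lambda(e^{iu}-1)}$ and $e^{iu\mu - \sigma^2u^2/2}$; the function names and the sample parameters are illustrative choices only. It evaluates $f_n^n - f$ on a grid of $u$ values.

```python
import cmath

def poisson_chf(u, lam):
    """Characteristic function of the Poisson distribution with parameter lam."""
    return cmath.exp(lam * (cmath.exp(1j * u) - 1))

def normal_chf(u, mu, var):
    """Characteristic function of the normal distribution N(mu, var)."""
    return cmath.exp(1j * mu * u - var * u * u / 2)

def max_gap(n=7, lam=2.5, mu=1.0, var=4.0):
    """Largest |f_n(u)^n - f(u)| over a grid, for the Poisson and normal cases."""
    us = [k / 10 for k in range(-50, 51)]
    gap_poisson = max(
        abs(poisson_chf(u, lam / n) ** n - poisson_chf(u, lam)) for u in us
    )
    gap_normal = max(
        abs(normal_chf(u, mu / n, var / n) ** n - normal_chf(u, mu, var)) for u in us
    )
    return gap_poisson, gap_normal

gap_poisson, gap_normal = max_gap()
print(gap_poisson, gap_normal)
```

Both gaps are at the level of floating-point rounding, as expected from the exact identities in (1) and (2) of the example.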

We have the following criterion for infinite divisibility (its necessity is obvious, by (1); we omit the proof of its sufficiency).


Theorem I.9.3. A random variable $X$ is i.d. iff there exists a triangular array (1) such that (cf. Definition I.4.3)

\[ T_n \to_w X. \qquad (4) \]


Some elementary properties of i.d. random variables (or ch.f.'s) are stated in the next theorem.

Theorem I.9.4.

(a) If $X$ is i.d., so is $Y := a + bX$.

(b) If $f, g$ are i.d. characteristic functions, so is $fg$.

(c) If $f$ is an i.d. characteristic function, so are $\bar{f}$ and $|f|^2$.

(d) If $f$ is an i.d. characteristic function, then $f \ne 0$ everywhere.

(e) If $\{f_k\}$ is a sequence of i.d. characteristic functions converging pointwise to a function $g$ continuous at 0, then $g$ is an i.d. ch.f.

(f) If $f$ is an i.d. characteristic function, then its representation (2) is unique (for each $n$).


Proof.
(a) By Proposition I.3.5 and (2), for all $n \in \mathbb{N}$,
\[ f_Y(u) = e^{iua}f(bu) = [e^{iua/n}f_n(bu)]^n = g_n^n, \]

where $g_n$ is clearly a ch.f.




(b) Represent $f, g$ as in (2), for each $n$. Then $fg = [f_ng_n]^n$. By Corollary I.2.4, $fg$ and $f_ng_n$ are ch.f.'s, and they satisfy (2) as needed.

(c) First, $\bar{f}$ is a ch.f., since

\[ \bar{f}(u) = \overline{\int_{\mathbb{R}} e^{iux}\,dF(x)} = \int_{\mathbb{R}} e^{-iux}\,dF(x) = \int_{\mathbb{R}} e^{iux}\,d[1 - F(-x)] = \int_{\mathbb{R}} e^{iux}\,dF_{-X}(x) = f_{-X}(u). \]

If $f$ is i.d., then by (2), $\bar{f} = (\bar{f}_n)^n$, where $\bar{f}_n$ is a ch.f., as needed. The conclusion about $|f|^2$ follows then from (b).

(d) Since it suffices to prove that $|f|^2 \ne 0$, and $|f|^2$ is a non-negative i.d. ch.f. (by (c)), we may assume without loss of generality that $f \ge 0$. Let then $f_n = f^{1/n}$ be the unique non-negative $n$th root of $f$. Then $g := \lim_n f_n$ (pointwise) exists:

\[ g = I_E, \qquad E = f^{-1}(\mathbb{R}^+). \]

Since $f(0) = 1$, the point 0 belongs to the open set $E$, and so $g = 1$ in a neighborhood of 0; in particular, $g$ is continuous at 0. By the Paul Lévy continuity theorem (Theorem I.4.8), $g$ is a ch.f., and is therefore continuous everywhere. In particular, its range is connected; since it is a subset of $\{0, 1\}$ containing 1, it must be precisely $\{1\}$, that is, $g = 1$ everywhere. This means that $f > 0$ everywhere, as claimed.

(e) By the Paul Lévy continuity theorem, $g$ is a ch.f. By (d), $f_k \ne 0$ everywhere (for each $k$), and has therefore a continuous logarithm $\log f_k$, uniquely determined by the condition $\log f_k(0) = 0$. Since $f_k$ is i.d., $f_k = f_{kn}^n$, with $f_{kn}$ ch.f.'s (by (2)). We have

\[ e^{(1/n)\log f_k} = e^{(1/n)\log f_{kn}^n} = f_{kn}. \]

The left-hand side converges pointwise (as $k \to \infty$) to $e^{(1/n)\log g} := g_n$. Since $g$ is a ch.f., $g(0) = 1$, and therefore $\log g$ is continuous at 0, and the same is true of $g_n\ (= \lim_k f_{kn})$. By Paul Lévy's theorem, $g_n$ is a ch.f., and clearly $g_n^n = g$. Hence $g$ is i.d.

(f) Fix $n$, and suppose $g, h$ are ch.f.'s such that

\[ g^n = h^n = f. \]

By (d), $h \ne 0$ everywhere, and therefore $g/h$ is continuous, and $(g/h)^n = 1$ everywhere. The continuity implies that $g/h$ has a connected range, which is a subset of the finite set of $n$th roots of unity. Since $g(0) = h(0) = 1$ (these are ch.f.'s!), the range contains 1, and coincides therefore with the singleton $\{1\}$, that is, $g/h = 1$ identically.




By Example I.9.2(1) and Theorem I.9.4 (Part (a)), if $Y = a + bX$ with $X$ Poisson-distributed, then $Y$ is i.d. We call the distribution $F_Y$ of such an r.v. $Y$ a Poisson-type distribution. By Proposition I.3.5 and Section I.3.9,

\[ f_Y(u) = e^{iua + \lambda(e^{iub} - 1)}. \qquad (5) \]

The Poisson-type distributions "generate" all the i.d. distributions in the following sense (compare with Theorem I.9.3).


Theorem I.9.5. A random variable $X$ is infinitely divisible iff there exists an array
\[ \{X_{nk};\ 1 \le k \le r(n),\ n = 1,2,\ldots\} \]

such that, for each $n$, the r.v.'s in the $n$th row are independent Poisson-type, and

\[ T_n := \sum_{k=1}^{r(n)} X_{nk} \to_w X. \]

Proof. Sufficiency. As we just observed, each $X_{nk}$ is i.d., hence the $T_n$ are i.d. by Theorem I.9.4, Part (b), and therefore $X$ is i.d. by Part (e) of Theorem I.9.4 and Corollary I.4.6.

Necessity. Let $X$ be i.d. By (2), there exist ch.f.'s $f_n$ such that $f := f_X = f_n^n$, $n = 1,2,\ldots$. By Theorem I.9.4, Part (f) and the proof of Part (e), the $f_n$ are uniquely determined, and can be written as

\[ f_n = e^{(1/n)\log f}, \]

where $\log f$ is continuous and uniquely determined by the condition $\log f(0) = 0$. Fix $u \in \mathbb{R}$. Then

\[ n[f_n(u) - 1] = n[e^{(1/n)\log f(u)} - 1] = n[(1/n)\log f(u) + o(1/n)] \to_{n\to\infty} \log f(u), \]

that is, if $F_n$ denotes the distribution function with ch.f. $f_n$, then

\[ \log f(u) = \lim_{n\to\infty} n\int_{\mathbb{R}} (e^{iux} - 1)\,dF_n(x). \qquad (6) \]
For each $n$, let $m = m(n)$ be such that
\[ 1 - F_n(m) + F_n(-m) < \frac{1}{2n^2}. \]

Then for all $u \in \mathbb{R}$,

\[ \Big|\int_{\mathbb{R}} (e^{iux} - 1)\,dF_n(x) - \int_{-m(n)}^{m(n)} (e^{iux} - 1)\,dF_n(x)\Big| \le \frac{1}{n^2}. \]




Approximate the Stieltjes integral over $[-m(n), m(n)]$ by Riemann-Stieltjes sums, such that

\[ \Big|\int_{-m(n)}^{m(n)} (e^{iux} - 1)\,dF_n(x) - \sum_{k=1}^{r(n)} (e^{iux_k} - 1)[F_n(x_k) - F_n(x_{k-1})]\Big| < \frac{1}{n^2}, \]

where $x_k = x_k(n) = -m(n) + 2m(n)k/r(n)$, $k = 1,\ldots,r(n)$. By (6), $f$ is the pointwise limit (as $n \to \infty$) of the products

\[ \prod_{k=1}^{r(n)} \exp\{\lambda_{nk}(e^{iua_{nk}} - 1)\}, \qquad (7) \]

where $\lambda_{nk} = n[F_n(x_k) - F_n(x_{k-1})]$ and $a_{nk} = x_k\ (= x_k(n))$.
The products in (7) are the ch.f.'s of sums of $r(n)$ independent Poisson-type r.v.'s.


I.10 More on sequences of random variables 


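Let $\{X_n\}$ be a sequence of (complex) random variables on the probability space $(\Omega, \mathcal{A}, P)$. For $c > 0$, set

\[ m(c) := \sup_n \int_{[|X_n| \ge c]} |X_n|\,dP. \]

Definition I.10.1. $\{X_n\}$ is uniformly integrable (u.i.) if

\[ \lim_{c\to\infty} m(c) = 0. \]

For example, if $|X_n| \le g \in L^1(P)$ for all $n$, then $[|X_n| \ge c] \subset [g \ge c]$, so that

\[ m(c) \le \int_{[g \ge c]} g\,dP \to 0 \]

as $c \to \infty$.
A less trivial example is given in the following.


Proposition I.10.2. Let $\mathcal{A}_n$ be sub-$\sigma$-algebras of $\mathcal{A}$, let $Y \in L^1(P)$, and
\[ X_n = E(Y \mid \mathcal{A}_n), \quad n = 1,2,\ldots. \]
Then the $X_n$ are u.i.


Proof. Since $|X_n| \le E(|Y| \mid \mathcal{A}_n)$, and the $X_n$ are $\mathcal{A}_n$-measurable (so that $A_n := [|X_n| \ge c] \in \mathcal{A}_n$),

\[ \int_{A_n} |X_n|\,dP \le \int_{A_n} E(|Y| \mid \mathcal{A}_n)\,dP = \int_{A_n} |Y|\,dP. \]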




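We have
\[ P(A_n) \le E(|X_n|)/c \le E(|Y|)/c. \]

Given $\epsilon > 0$, choose $K > 0$ such that $\int_{[|Y| > K]} |Y|\,dP < \epsilon$. Then

\[ \int_{A_n} |Y|\,dP = \Big(\int_{A_n \cap [|Y| \le K]} + \int_{A_n \cap [|Y| > K]}\Big)|Y|\,dP \le KP(A_n) + \epsilon \le K\|Y\|_1/c + \epsilon, \]

hence
\[ m(c) \le K\|Y\|_1/c + \epsilon. \]

Since $\epsilon$ was arbitrary, it follows that $\lim_{c\to\infty} m(c) = 0$.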


Theorem I.10.3. $\{X_n\}$ is u.i. iff
(1) $\sup_n \|X_n\|_1 < \infty$ ("norm-boundedness") and
(2) $\sup_n \int_A |X_n|\,dP \to 0$ when $PA \to 0$ ("uniform absolute continuity").


Proof. If the sequence is u.i., let

\[ M := \sup_{c>0} m(c)\ (< \infty). \]

Then for any $c > 0$,

\[ \|X_n\|_1 = \Big(\int_{[|X_n| < c]} + \int_{[|X_n| \ge c]}\Big)|X_n|\,dP \le c + m(c) \le c + M, \]

so that (1) is valid.
Also, for any $A \in \mathcal{A}$,

\[ \int_A |X_n|\,dP = \Big(\int_{A \cap [|X_n| < c]} + \int_{A \cap [|X_n| \ge c]}\Big)|X_n|\,dP \le cPA + m(c). \]

Given $\epsilon > 0$, fix $c > 0$ such that $m(c) < \epsilon/2$. For this $c$, if $PA < \epsilon/2c$, then $\int_A |X_n|\,dP < \epsilon$ for all $n$, proving (2).
Conversely, if (1) and (2) hold, then by (1), for all $n$,

\[ cP[|X_n| \ge c] \le \int_{[|X_n| \ge c]} |X_n|\,dP \le \sup_n \|X_n\|_1 =: R < \infty, \]

so that

\[ P[|X_n| \ge c] \le R/c. \]
Given $\epsilon > 0$, there exists $\delta > 0$ such that $\int_A |X_n|\,dP < \epsilon$ for all $n$ whenever $A \in \mathcal{A}$ has $PA < \delta$ (by (2)). Therefore, if $c > R/\delta$, surely $P[|X_n| \ge c] < \delta$, and consequently $m(c) \le \epsilon$.
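
Norm-boundedness alone does not give uniform integrability. As a hedged illustration (an example not taken from the text), take $\Omega = [0,1]$ with Lebesgue measure and $X_n = n^a I_{[0,1/n]}$; then $\|X_n\|_1 = n^{a-1} \le 1$ for $a \le 1$, and $\int_{[|X_n| \ge c]} |X_n|\,dP$ equals $n^{a-1}$ when $n^a \ge c$ and 0 otherwise. The helper `m` below (a hypothetical name) evaluates the supremum over $n \le$ `n_max`, which agrees with $m(c)$ for the values of $c$ used here.

```python
def m(c, a, n_max=10**5):
    """sup over n <= n_max of the integral of X_n over [|X_n| >= c],
    for X_n = n**a on [0, 1/n] and 0 elsewhere (Lebesgue measure on [0, 1])."""
    best = 0.0
    for n in range(1, n_max + 1):
        height = n ** a
        if height >= c:
            # X_n equals `height` on an interval of length 1/n
            best = max(best, height / n)
    return best

# a = 1: norm-bounded, but m(c) = 1 for every c, so the family is not u.i.
not_ui = [m(c, 1) for c in (10, 100, 1000)]
# a = 1/2: m(c) is about 1/c, which tends to 0, so the family is u.i.
ui = [m(c, 0.5) for c in (10, 100)]
print(not_ui, ui)
```

For $a = 1$ condition (2) of Theorem I.10.3 fails on the sets $A = [0, 1/n]$, whose probability tends to 0 while $\int_A |X_n|\,dP = 1$.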


Fatou’s lemma extends as follows to u.i. real r.v.’s: 




Theorem I.10.4. Let $X_n$ be u.i. real r.v.'s. Then
\[ E(\liminf X_n) \le \liminf E(X_n) \le \limsup E(X_n) \le E(\limsup X_n). \]


Proof. Since $[X_n < -c] \subset [|X_n| \ge c]$, we have

\[ \Big|\int_{[X_n < -c]} X_n\,dP\Big| \le m(c) \]

for any $c > 0$. Given $\epsilon > 0$, we may fix $c > 0$ such that $m(c) < \epsilon$ (since the $X_n$ are u.i.).

Denote $A_n = [X_n \ge -c]$.

We apply Fatou's lemma to the non-negative measurable functions $c + X_nI_{A_n}$:

\[ E(c + \liminf X_nI_{A_n}) \le \liminf E(c + X_nI_{A_n}), \]
hence,
\[ E(\liminf X_nI_{A_n}) \le \liminf E(X_nI_{A_n}). \qquad (*) \]
However,

\[ E(X_nI_{A_n}) = E(X_n) - \int_{A_n^c} X_n\,dP \le E(X_n) + \epsilon, \]

and therefore the right-hand side of (*) is
\[ \le \liminf E(X_n) + \epsilon. \]

Since $X_nI_{A_n} \ge X_n$, the left-hand side of (*) is $\ge E(\liminf X_n)$, and the left inequality of the theorem follows, because $\epsilon$ is arbitrary. The right inequality is then obtained by replacing $X_n$ by $-X_n$.


Corollary I.10.5. Let $X_n$ be u.i. (complex) r.v.'s, such that $X_n \to X$ a.s. or in probability. Then $X_n \to X$ in $L^1$ (and in particular, $E(X_n) \to E(X)$).


Proof. Let $K := \sup_n \|X_n\|_1$ ($< \infty$, by Theorem I.10.3(1)). By Fatou's lemma, if $X_n \to X$ a.s., then
\[ \|X\|_1 \le \liminf_n \|X_n\|_1 \le K, \]

so $X \in L^1$ and
\[ \sup_n \|X_n - X\|_1 \le 2K. \]

Also, for $A \in \mathcal{A}$, again by Fatou's lemma,
\[ \int_A |X|\,dP \le \liminf_n \int_A |X_n|\,dP \le \sup_n \int_A |X_n|\,dP; \]

hence,

\[ \sup_n \int_A |X_n - X|\,dP \le 2\sup_n \int_A |X_n|\,dP \to 0 \]

as $PA \to 0$, by Theorem I.10.3(2).




Consequently (by the same theorem) the $|X_n - X|$ are u.i., and since $|X_n - X| \to 0$ a.s., Theorem I.10.4 applied to $|X_n - X|$ shows that $\|X_n - X\|_1 \to 0$.

In case $X_n \to X$ in probability, there exists a subsequence $X_{n_k}$ converging a.s. to $X$. By the first part of the proof, $X \in L^1$, and $\|X_{n_k} - X\|_1 \to 0$. The previous argument shows that the $|X_n - X|$ are u.i. Therefore any subsequence of $X_n$ has a subsequence converging to $X$ in $L^1$. If we assume that $\{X_n\}$ itself does not converge to $X$ in $L^1$, then there exist $\epsilon > 0$ and a subsequence $X_{n_k}$ such that $\|X_{n_k} - X\|_1 \ge \epsilon$ for all $k$, a contradiction.


Definition I.10.6. A submartingale is a sequence of ordered pairs $(X_n, \mathcal{A}_n)$, where

(1) $\{\mathcal{A}_n\}$ is an increasing sequence of sub-$\sigma$-algebras of $\mathcal{A}$;

(2) the real r.v. $X_n \in L^1$ is $\mathcal{A}_n$-measurable ($n = 1,2,\ldots$); and

(3) $X_n \le E(X_{n+1} \mid \mathcal{A}_n)$ a.s. ($n = 1,2,\ldots$).

If equality holds in (3), the sequence is called a martingale. The sequence is a supermartingale if the inequality (3) is reversed.

By definition of conditional expectation, a sequence $(X_n, \mathcal{A}_n)$ satisfying (1) and (2) is a submartingale iff

\[ \int_A X_n\,dP \le \int_A X_{n+1}\,dP \quad (A \in \mathcal{A}_n). \]

For example, if the $\mathcal{A}_n$ are as in (1) and $Y$ is an $L^1$-r.v., then setting
\[ X_n := E(Y \mid \mathcal{A}_n), \quad n = 1,2,\ldots, \]

the sequence $(X_n, \mathcal{A}_n)$ is a martingale: indeed (2) is clear by definition, and equality in (3) follows from Theorem I.7.5.
An important example is given in the following.


Proposition I.10.7. Let $\{Y_k\}$ be a sequence of $L^1$, central, independent r.v.'s. Let $\mathcal{A}_n$ be the smallest $\sigma$-algebra for which $Y_1,\ldots,Y_n$ are measurable, and let $X_n = Y_1 + \cdots + Y_n$. Then $(X_n, \mathcal{A}_n)$ is a martingale.


Proof. The requirements (1) and (2) are clear.
If $Y_{n+1} = I_B$ with $B \in \mathcal{A}$ independent of all $A \in \mathcal{A}_n$, then for all $A \in \mathcal{A}_n$,

\[ \int_A Y_{n+1}\,dP = P(A \cap B) = P(A)P(B) = \int_A E(Y_{n+1})\,dP, \]

and this identity between the extreme terms remains true, by linearity, for all simple r.v.'s $Y_{n+1}$ independent of $Y_1,\ldots,Y_n$. By monotone convergence, the identity is true for all $Y_{n+1} \ge 0$, and finally for all $L^1$-r.v.'s $Y_{n+1}$ independent of $Y_1,\ldots,Y_n$. Therefore,

\[ E(Y_{n+1} \mid \mathcal{A}_n) = E(Y_{n+1}) \quad \text{a.s.}, \]




and in the central case,
\[ E(Y_{n+1} \mid \mathcal{A}_n) = 0 \quad \text{a.s.} \]

Since $X_n$ is $\mathcal{A}_n$-measurable,

\[ E(X_n \mid \mathcal{A}_n) = X_n \quad \text{a.s.} \]

Adding these equations, we obtain $E(X_{n+1} \mid \mathcal{A}_n) = X_n$ a.s.


Proposition I.10.8. If $(X_n, \mathcal{A}_n)$ is a submartingale, and $g : \mathbb{R} \to \mathbb{R}$ is a convex, increasing function such that $g(X_n) \in L^1$ for all $n$, then $(g(X_n), \mathcal{A}_n)$ is a submartingale. If $(X_n, \mathcal{A}_n)$ is a martingale, the preceding conclusion is valid without assuming that $g$ is increasing.


Proof. Since $g$ is increasing, and $X_n \le E(X_{n+1} \mid \mathcal{A}_n)$ a.s., we have a.s.
\[ g(X_n) \le g(E(X_{n+1} \mid \mathcal{A}_n)). \]

By Jensen's inequality for the convex function $g$, the right-hand side is $\le E(g(X_{n+1}) \mid \mathcal{A}_n)$ a.s., proving the first statement. In the martingale case, since $X_n = E(X_{n+1} \mid \mathcal{A}_n)$, we get a.s.

\[ g(X_n) = g(E(X_{n+1} \mid \mathcal{A}_n)) \le E(g(X_{n+1}) \mid \mathcal{A}_n). \]
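
Propositions I.10.7 and I.10.8 can be verified exactly on a small discrete model. The sketch below assumes a hypothetical step distribution ($Y = 2$ with probability $1/3$, $Y = -1$ with probability $2/3$, so $E(Y) = 0$) and the convex function $g(x) = x^2$; all names are illustrative. On each atom of $\mathcal{A}_n$ (a fixed path $Y_1,\ldots,Y_n$) it computes $E(X_{n+1} \mid \mathcal{A}_n)$ and $E(X_{n+1}^2 \mid \mathcal{A}_n)$ in rational arithmetic.

```python
from fractions import Fraction
from itertools import product

# Hypothetical central step: Y = 2 with prob. 1/3, Y = -1 with prob. 2/3.
STEPS = ((2, Fraction(1, 3)), (-1, Fraction(2, 3)))

def conditional_moments(path):
    """Return X_n, E(X_{n+1} | path) and E(X_{n+1}^2 | path) on one atom of A_n."""
    x_n = sum(path)
    first = sum(p * (x_n + y) for y, p in STEPS)
    second = sum(p * (x_n + y) ** 2 for y, p in STEPS)
    return x_n, first, second

def check(n):
    """Check the martingale property of X_n and the submartingale property of X_n^2."""
    values = [y for y, _ in STEPS]
    ok_martingale = True
    ok_submartingale = True
    for path in product(values, repeat=n):
        x_n, first, second = conditional_moments(path)
        ok_martingale = ok_martingale and (first == x_n)
        ok_submartingale = ok_submartingale and (second >= x_n ** 2)
    return ok_martingale, ok_submartingale

result = check(6)
print(result)
```

Here $E(X_{n+1}^2 \mid \mathcal{A}_n) = X_n^2 + E(Y^2)$, so the inequality in Proposition I.10.8 is strict for this choice of $g$.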


We omit the proof of the following important theorem. 


Theorem I.10.9 (submartingale convergence theorem). If $(X_n, \mathcal{A}_n)$ is a submartingale such that
\[ \sup_n E(X_n^+) < \infty, \]

then there exists an $L^1$-r.v. $X$ such that $X_n \to X$ a.s.


By the proposition following Definition I.10.1 and the comments following Definition I.10.6, if the $\mathcal{A}_n$ are increasing sub-$\sigma$-algebras, $Y$ is an $L^1$-r.v., and $X_n := E(Y \mid \mathcal{A}_n)$, then $(X_n, \mathcal{A}_n)$ is a u.i. martingale. The converse is also true:


Theorem I.10.10. $(X_n, \mathcal{A}_n)$ is a u.i. martingale iff there exists an $L^1$-r.v. $Y$ such that $X_n = E(Y \mid \mathcal{A}_n)$ (a.s.) for all $n$. When this is the case, $X_n \to Y = E(Y \mid \mathcal{A}_\infty)$ a.s. and in $L^1$, where $\mathcal{A}_\infty$ is the $\sigma$-algebra generated by the algebra $\bigcup_n \mathcal{A}_n$.


Proof. We just observed that if $X_n = E(Y \mid \mathcal{A}_n)$, then $(X_n, \mathcal{A}_n)$ is a u.i. martingale. Conversely, let $(X_n, \mathcal{A}_n)$ be a u.i. martingale. By Theorem I.10.3 (1), $\sup_n \|X_n\|_1 < \infty$, hence by Theorem I.10.9, there exists an $L^1$-r.v. $Y$ such that $X_n \to Y$ a.s. By Corollary I.10.5, $X_n \to Y$ in $L^1$ as well. Hence, for all $A \in \mathcal{A}_n$ and $m \ge n$,

\[ \int_A X_n\,dP = \int_A X_m\,dP \to_{m\to\infty} \int_A Y\,dP = \int_A E(Y \mid \mathcal{A}_n)\,dP, \]

and it follows that $X_n = E(Y \mid \mathcal{A}_n)$ a.s.

For any Borel set $B \subset \mathbb{R}$, we have $X_n^{-1}(B) \in \mathcal{A}_n \subset \mathcal{A}_\infty$. Hence $X_n$ is $\mathcal{A}_\infty$-measurable for all $n$. If we give $Y$ some arbitrary value on the null set where it is not determined, $Y$ is $\mathcal{A}_\infty$-measurable, and therefore $E(Y \mid \mathcal{A}_\infty) = Y$.


Application II 


Distributions 


II.1 Preliminaries 


II.1.1. Let $\Omega$ be an open subset of $\mathbb{R}^n$. For $0 \le k < \infty$, let $C^k(\Omega)$ denote the space of all complex functions on $\Omega$ with continuous (mixed) partial derivatives of order $\le k$. The intersection of all these spaces is denoted by $C^\infty(\Omega)$; it is the space of all (complex) functions with continuous partial derivatives of all orders in $\Omega$. For $0 \le k \le \infty$, $C_c^k(\Omega)$ stands for the space of all $f \in C^k(\Omega)$ with compact support (in $\Omega$). We shall also use the notation $C_c^k(A)$ for the space of all functions in $C^k(\mathbb{R}^n)$ with compact support in the arbitrary set $A \subset \mathbb{R}^n$. (The latter notation is consistent with the preceding one when $A$ is open, since in that case any $C^k$-function in $A$ with compact support in $A$ extends trivially to a $C^k$-function on $\mathbb{R}^n$.)


Define $f : \mathbb{R} \to [0, \infty)$ by $f(t) = 0$ for $t \ge 0$, and $f(t) = e^{1/t}$ for $t < 0$. Then $f \in C^\infty(\mathbb{R})$, and therefore, with a suitable choice of the constant $\gamma$, the function $\phi(x) := \gamma f(|x|^2 - 1)$ on $\mathbb{R}^n$ has the following properties:
(1) $\phi \in C_c^\infty(\mathbb{R}^n)$;

(2) $\mathrm{supp}\,\phi = \{x \in \mathbb{R}^n;\ |x| \le 1\}$; and

(3) $\phi \ge 0$ and $\int \phi\,dx = 1$.
(For $x \in \mathbb{R}^n$, $|x|$ denotes the usual Euclidean norm; $\int \cdot\,dx$ denotes integration over $\mathbb{R}^n$ with respect to the $n$-dimensional Lebesgue measure $dx$.)

In the following, $\phi$ denotes any fixed function with Properties (1)-(3).


If $u : \mathbb{R}^n \to \mathbb{C}$ is locally integrable, then for any $r > 0$, we consider its regularization

\[ u_r(x) = \int u(x - ry)\phi(y)\,dy = r^{-n}\int u(y)\phi\Big(\frac{x - y}{r}\Big)\,dy, \qquad (1) \]

(i.e., $u_r$ is the convolution of $u$ with $\phi_r := r^{-n}\phi(\cdot/r)$; note that the subscript $r$ has different meanings when assigned to $u$ and to $\phi$.)




Theorem II.1.2. Let $K$ be a compact subset of (the open set) $\Omega \subset \mathbb{R}^n$, and let $u \in L^1(\mathbb{R}^n)$ vanish outside $K$. Then

(1) $u_r \in C_c^\infty(\Omega)$ for all $r < \delta := \mathrm{dist}(K, \Omega^c)$;

(2) $\lim_{r\to 0+} u_r = u$ in $L^p$-norm if $u \in L^p$ ($1 \le p < \infty$), and uniformly if $u$ is continuous.


Proof. Since $u_r = u * \phi_r$, one sees easily that (mixed) differentiation of any order can be performed on $u_r$ by performing the operation on $\phi_r$; the resulting convolution is clearly continuous. Hence $u_r \in C^\infty(\Omega)$.

Let

\[ K_r := \{x \in \mathbb{R}^n;\ \mathrm{dist}(x, K) \le r\}. \qquad (2) \]

It is a compact set, contained in $\Omega$ for $r < \delta$. If $y$ is in the closed unit ball $S$ of $\mathbb{R}^n$ and $x - ry \in K$, then $\mathrm{dist}(x, K) \le |x - (x - ry)| = |ry| \le r$, that is, $x \in K_r$. Hence, for $x \notin K_r$, $x - ry \notin K$ for all $y \in S$. Since $u$ and $\phi$ vanish outside $K$ and $S$, respectively, it follows from (1) that $u_r = 0$ outside $K_r$. Therefore $\mathrm{supp}\,u_r \subset K_r \subset \Omega$ (and so $u_r \in C_c^\infty(\Omega)$) for $r < \delta$.

By Property (3) of $\phi$,

\[ u_r(x) - u(x) = \int_S [u(x - ry) - u(x)]\phi(y)\,dy. \]

If $u$ is continuous (hence uniformly continuous, since its support is in $K$), then for given $\epsilon > 0$, there exists $\eta > 0$ such that $|u(x - ry) - u(x)| < \epsilon$ for all $x \in \mathbb{R}^n$ and all $y \in S$ if $r < \eta$. Hence $\|u_r - u\|_\infty \le \epsilon$ for $r < \eta$.
In case $u \in L^p$, we have
\[ \|u_r\|_p = \|u * \phi_r\|_p \le \|u\|_p\|\phi_r\|_1 = \|u\|_p. \]
Fix $v \in C_c(\Omega)$ such that $\|u - v\|_p < \epsilon$ (by density of $C_c(\Omega)$ in $L^p(\Omega)$). Let $M$ be a bound for the (Lebesgue) measure of $(\mathrm{supp}\,v)_r$ for all $r < 1$. Then for $r < 1$,

\[ \|u_r - u\|_p \le \|(u - v)_r\|_p + \|v_r - v\|_p + \|v - u\|_p < 2\epsilon + \|v_r - v\|_\infty M^{1/p} < 3\epsilon \]

(by the preceding case), for $r$ small enough.
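
The regularization and its uniform convergence can be sketched numerically in dimension $n = 1$. The code below is an illustration under stated assumptions, not a definitive implementation: the integral against $\phi$ is replaced by a midpoint rule whose weights are normalized to sum to 1 (which also yields an approximation of the constant $\gamma$), and $u$ is taken to be the hypothetical "tent" function $u(x) = \max(0, 1 - |x|)$, which is continuous with compact support. Since this $u$ is 1-Lipschitz, the identity $u_r(x) - u(x) = \int_S [u(x - ry) - u(x)]\phi(y)\,dy$ gives $\|u_r - u\|_\infty \le r$.

```python
import math

def f(t):
    """f(t) = exp(1/t) for t < 0 and f(t) = 0 for t >= 0, as in II.1.1."""
    return math.exp(1.0 / t) if t < 0 else 0.0

def mollifier_weights(points=801):
    """Midpoint nodes y_i in (-1, 1) with weights w_i ~ phi(y_i) dy, sum(w) = 1.
    Also returns the implied normalizing constant gamma (an approximation)."""
    h = 2.0 / points
    ys = [-1.0 + (i + 0.5) * h for i in range(points)]
    raw = [f(y * y - 1.0) * h for y in ys]
    total = sum(raw)                      # approximates 1/gamma
    return ys, [w / total for w in raw], 1.0 / total

def regularize(u, x, r, ys, ws):
    """Quadrature version of u_r(x) = integral of u(x - r*y) phi(y) dy."""
    return sum(w * u(x - r * y) for y, w in zip(ys, ws))

def tent(x):
    """Hypothetical test function: continuous, supported in [-1, 1], 1-Lipschitz."""
    return max(0.0, 1.0 - abs(x))

def sup_error(r, ys, ws):
    """max |u_r(x) - u(x)| over a grid of x values in [-2, 2]."""
    grid = [k / 100.0 for k in range(-200, 201)]
    return max(abs(regularize(tent, x, r, ys, ws) - tent(x)) for x in grid)

ys, ws, gamma = mollifier_weights()
errors = {r: sup_error(r, ys, ws) for r in (0.5, 0.1)}
print(gamma, errors)
```

The largest error occurs at the kinks of the tent function, where the averaging in (1) is most visible, and it shrinks proportionally to $r$.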


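The inequality

\[ \|f * g\|_p \le \|f\|_p\|g\|_1 \quad (f \in L^p;\ g \in L^1) \qquad (3) \]
used in the preceding proof, can be verified as follows:
\[ \|f * g\|_p = \sup\Big\{\Big|\int (f * g)h\,dx\Big|;\ h \in L^q,\ \|h\|_q \le 1\Big\} \le \sup_h \int\!\!\int |f(x - y)||g(y)|\,dy\,|h(x)|\,dx \]
\[ = \sup_h \int\!\!\int |f(x - y)||h(x)|\,dx\,|g(y)|\,dy \le \sup_h \int \|f(\cdot - y)\|_p\|h\|_q|g(y)|\,dy = \sup_h \|f\|_p\|h\|_q\|g\|_1 = \|f\|_p\|g\|_1, \]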
h h 




where the suprema are taken over all $h$ in the unit ball of $L^q$. (We used Theorems 4.6, 2.18, and 1.33, and the translation invariance of Lebesgue measure.)


Corollary II.1.3.

(1) $C_c^\infty(\Omega)$ is dense in $L^p(\Omega)$ ($1 \le p < \infty$).

(2) A regular complex Borel measure $\mu$ on $\Omega$ is uniquely determined by the integrals $\int_\Omega f\,d\mu$ with $f \in C_c^\infty(\Omega)$.


Corollary II.1.4. Let $K$ be a compact subset of the open set $\Omega \subset \mathbb{R}^n$. Then there exists $\psi \in C_c^\infty(\Omega)$ such that $0 \le \psi \le 1$ and $\psi = 1$ in a neighborhood of $K$.


Proof. Let $\delta$ be as in Theorem II.1.2, $r < \delta/3$, and $\psi = u_r$, where $u$ is the indicator of $K_{2r}$ (cf. (2)). Then $\mathrm{supp}\,\psi \subset (K_{2r})_r = K_{3r} \subset \Omega$, $\psi \in C_c^\infty(\Omega)$, $0 \le \psi \le 1$, and $\psi = 1$ on $K_r$.


Corollary II.1.4 is the special case $k = 1$ of the following.


Theorem II.1.5 (Partitions of unity in $C_c^\infty(\Omega)$). Let $\Omega_1,\ldots,\Omega_k$ be an open covering of the compact set $K$ in $\mathbb{R}^n$. Then there exist $\phi_j \in C_c^\infty(\Omega_j)$ ($j = 1,\ldots,k$) such that $\phi_j \ge 0$, $\sum_j \phi_j \le 1$, and $\sum_j \phi_j = 1$ in a neighborhood of $K$.


The set $\{\phi_j\}$ is called a partition of unity subordinate to the covering $\{\Omega_j\}$.


Proof. There exist open sets with compact closures $K_j \subset \Omega_j$ such that $K \subset \bigcup_j K_j$. Let $\psi_j$ be associated with $K_j$ and $\Omega_j$ as in Corollary II.1.4. Define $\phi_1 = \psi_1$ and $\phi_j = \psi_j(1 - \psi_{j-1})\cdots(1 - \psi_1)$ for $j = 2,\ldots,k$ (as in the proof of Theorem 3.3). Then

\[ \sum_j \phi_j = 1 - \prod_j (1 - \psi_j), \]

from which we read off the desired properties of $\phi_j$.
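
The telescoping identity $\sum_j \phi_j = 1 - \prod_j (1 - \psi_j)$ used in the proof is purely algebraic and can be checked pointwise. In the sketch below, `psis` stands for the values $\psi_1(x),\ldots,\psi_k(x) \in [0, 1]$ at one fixed point $x$; the function name `partition_from_cutoffs` is a hypothetical helper.

```python
def partition_from_cutoffs(psis):
    """Given psi_1(x), ..., psi_k(x) in [0, 1], return the values
    phi_1 = psi_1, phi_j = psi_j (1 - psi_{j-1}) ... (1 - psi_1),
    together with the product of all (1 - psi_j)."""
    phis = []
    remaining = 1.0            # product of (1 - psi_i) over i < j
    for psi in psis:
        phis.append(psi * remaining)
        remaining *= (1.0 - psi)
    return phis, remaining

# At a point where some psi_j equals 1 (e.g. a point of K), the phi_j sum to 1.
phis_on_k, rest_on_k = partition_from_cutoffs([0.25, 1.0, 0.5])
# At a general point the sum is 1 - prod(1 - psi_j), which is at most 1.
phis_off_k, rest_off_k = partition_from_cutoffs([0.5, 0.5])
print(phis_on_k, rest_on_k, phis_off_k, rest_off_k)
```

Each $\phi_j$ is non-negative and vanishes wherever $\psi_j$ does, which is why $\phi_j$ inherits its support from $\psi_j$.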


II.2 Distributions 


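II.2.1. Topology on $C_c^\infty(\Omega)$.
Let $D_j := -i\partial/\partial x_j$ ($j = 1,\ldots,n$). For any "multi-index" $\alpha = (\alpha_1,\ldots,\alpha_n)$ ($\alpha_j = 0,1,2,\ldots$) and $x = (x_1,\ldots,x_n) \in \mathbb{R}^n$, set

\[ D^\alpha := D_1^{\alpha_1}\cdots D_n^{\alpha_n}, \]

\[ \alpha! := \alpha_1!\cdots\alpha_n!, \quad |\alpha| = \sum_j \alpha_j, \quad \text{and} \quad x^\alpha := x_1^{\alpha_1}\cdots x_n^{\alpha_n}. \]
We denote also
\[ f^{(\alpha)} = \frac{\partial^{|\alpha|}f}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}} \]

for any function $f$ for which the shown derivative makes sense.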




Let $K$ be a compact subset of the open set $\Omega \subset \mathbb{R}^n$. The sequence of semi-norms on $C_c^\infty(K)$

\[ \|\phi\|_k := \sum_{|\alpha| \le k} \sup|D^\alpha\phi|, \quad k = 0,1,\ldots, \]

induces a (locally convex) topology on the vector space $C_c^\infty(K)$: it is the weakest topology on the space for which all these semi-norms are continuous. Basic neighborhoods of 0 in this topology have the form

\[ \{\phi \in C_c^\infty(K);\ \|\phi\|_k < \epsilon\}, \qquad (1) \]

with $\epsilon > 0$ and $k = 0,1,\ldots$.

A sequence $\{\phi_j\}$ converges to $\phi$ in $C_c^\infty(K)$ iff $D^\alpha\phi_j \to D^\alpha\phi$ uniformly, for all $\alpha$.

A linear functional $u$ on $C_c^\infty(K)$ is continuous iff there exist a constant $C > 0$ and a non-negative integer $k$ such that

\[ |u(\phi)| \le C\|\phi\|_k \quad (\phi \in C_c^\infty(K)) \qquad (2) \]

(cf. Theorem 4.2).

Let $\{\Omega_j\}$ be a sequence of open subsets of $\Omega$, with union $\Omega$, such that, for each $j$, $\Omega_j$ has compact closure $K_j$ contained in $\Omega_{j+1}$.

Since $C_c^\infty(\Omega) = \bigcup_j C_c^\infty(K_j)$, we may topologize $C_c^\infty(\Omega)$ in a natural way so that any linear functional $u$ on $C_c^\infty(\Omega)$ is continuous iff its restriction to $C_c^\infty(K)$ is continuous for all compact $K \subset \Omega$. (This so-called "inductive limit topology" will not be described here systematically.) We note the following facts:

(i) a linear functional $u$ on $C_c^\infty(\Omega)$ is continuous iff for each compact $K \subset \Omega$, there exist $C > 0$ and $k \in \mathbb{N} \cup \{0\}$ such that (2) holds;

(ii) a sequence $\{\phi_j\} \subset C_c^\infty(\Omega)$ converges to 0 iff $\{\phi_j\} \subset C_c^\infty(K)$ for some compact $K \subset \Omega$, and $\phi_j \to 0$ in $C_c^\infty(K)$;

(iii) a linear functional $u$ on $C_c^\infty(\Omega)$ is continuous iff $u(\phi_j) \to 0$ for any sequence $\{\phi_j\} \subset C_c^\infty(\Omega)$ converging to 0.


Definition II.2.2. The space $C_c^\infty(\Omega)$ with the topology described earlier is denoted $\mathcal{D}(\Omega)$ and is called the space of test functions on $\Omega$. The elements of its dual $\mathcal{D}'(\Omega)$ (= the space of all continuous linear functionals on $\mathcal{D}(\Omega)$) are called distributions in $\Omega$.

The topology on $\mathcal{D}'(\Omega)$ is the "weak*" topology: the net $\{u_\nu\}$ of distributions converges to 0 iff $u_\nu(\phi) \to 0$ for all $\phi \in \mathcal{D}(\Omega)$.


II.2.3. Measures and functions.

If $\mu$ is a regular complex Borel measure on $\Omega$, it may be identified with a continuous linear functional on $C_c(\Omega)$ (through the Riesz representation theorem); since it is uniquely determined by its restriction to $\mathcal{D}(\Omega)$ (cf. Corollary II.1.3), and this restriction is continuous (with respect to the stronger




topology of $\mathcal{D}(\Omega)$), the measure $\mu$ can (and will) be identified with the distribution $u := \mu|_{\mathcal{D}(\Omega)}$. We say in this case that the distribution $u$ is a measure.

In the special case where $d\mu = f\,dx$ with $f \in L^1_{\mathrm{loc}}(\Omega)$ (i.e., $f$ "locally integrable", that is, integrable on compact subsets), the function $f$ is usually identified with the distribution it induces (as shown) through the formula

\[ f(\phi) := \int_\Omega \phi f\,dx \quad (\phi \in \mathcal{D}(\Omega)). \]

In such event, we say that the distribution is a function.

If $\Omega'$ is an open subset of $\Omega$, the restriction of a distribution $u$ in $\Omega$ to $\mathcal{D}(\Omega')$ is a distribution in $\Omega'$, denoted $u|_{\Omega'}$ (and called the restriction of $u$ to $\Omega'$). If the distributions $u_1, u_2$ have equal restrictions to some open neighborhood of $x$, one says that they are equal in a neighborhood of $x$.


Proposition II.2.4. If two distributions in $\Omega$ are equal in a neighborhood of each point of $\Omega$, then they are equal.


Proof. Fix $\phi \in \mathcal{D}(\Omega)$, and let $K = \mathrm{supp}\,\phi$. Each $x \in K$ has an open neighborhood $\Omega_x \subset \Omega$ in which the given distributions $u_1, u_2$ are equal. By compactness of $K$, there exist open sets $\Omega_j := \Omega_{x_j}$ ($j = 1,\ldots,m$) such that $K \subset \bigcup_j \Omega_j$. Let $\{\phi_j\}$ be a partition of unity subordinate to the open covering $\{\Omega_j\}$ of $K$. Then $\phi = \sum_j \phi\phi_j$ and $\phi\phi_j \in \mathcal{D}(\Omega_j)$. Hence $u_1(\phi\phi_j) = u_2(\phi\phi_j)$ by hypothesis, and therefore $u_1(\phi) = u_2(\phi)$.


II.2.5. The support.
For any distribution $u$ in $\Omega$, the set

\[ Z(u) := \{x \in \Omega;\ u = 0 \text{ in a neighborhood of } x\} \]

is open, and $u|_{Z(u)} = 0$ by Proposition II.2.4; furthermore, $Z(u)$ is the largest open subset $\Omega'$ of $\Omega$ such that $u|_{\Omega'} = 0$ (if $x \in \Omega'$ for such a set $\Omega'$, then $u = 0$ in the neighborhood $\Omega'$ of $x$, that is, $x \in Z(u)$; hence $\Omega' \subset Z(u)$). The support of $u$, denoted $\mathrm{supp}\,u$, is the set $\Omega - Z(u)$ (relatively closed in $\Omega$). The previous statement may be rephrased as follows in terms of the support: $\mathrm{supp}\,u$ is the smallest relatively closed subset $S$ of $\Omega$ such that $u(\phi) = 0$ for all $\phi \in \mathcal{D}(\Omega)$ such that $\mathrm{supp}\,\phi \cap S = \emptyset$.


If the distribution $u$ is a measure or a function, its support as a distribution coincides with its support as a measure or a function, respectively (exercise).


II.2.6. Differentiation.
Fix the open set $\Omega \subset \mathbb{R}^n$.
For $j = 1,\ldots,n$ and $u \in \mathcal{D}' := \mathcal{D}'(\Omega)$, we set

\[ (D_ju)\phi = -u(D_j\phi) \quad (\phi \in \mathcal{D} := \mathcal{D}(\Omega)). \]




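Then $D_ju \in \mathcal{D}'$ and the map $u \to D_ju$ is a continuous linear map of $\mathcal{D}'$ into itself. Furthermore, $D_kD_j = D_jD_k$ for all $j, k \in \{1,\ldots,n\}$, and

\[ (D^\alpha u)\phi = (-1)^{|\alpha|}u(D^\alpha\phi) \quad (\phi \in \mathcal{D}). \]
For example, if $\delta$ denotes the Borel measure
\[ \delta(E) = I_E(0) \quad (E \in \mathcal{B}(\mathbb{R}^n)) \]
(the so-called delta measure at 0), then for all $\phi \in \mathcal{D}$,
\[ (D^\alpha\delta)\phi = (-1)^{|\alpha|}\delta(D^\alpha\phi) = (-1)^{|\alpha|}(D^\alpha\phi)(0). \]

If $u$ is a function such that $\partial u/\partial x_j$ exists and is locally integrable in $\Omega$, then for all $\phi \in \mathcal{D}$, an integration by parts shows that

\[ (D_ju)\phi = -u(D_j\phi) = i\int u\,(\partial\phi/\partial x_j)\,dx = -i\int (\partial u/\partial x_j)\phi\,dx = \Big(\frac{1}{i}\frac{\partial u}{\partial x_j}\Big)(\phi), \]

so that $D_ju$ is the function $(1/i)\partial u/\partial x_j$ in this case, as desired.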
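
A function without a classical derivative still has a distributional one. As a hedged one-dimensional illustration (an example not taken from the text), let $H$ be the indicator of $[0, \infty)$ and $D = -i\,d/dx$. The definition gives $(DH)\phi = -H(D\phi) = i\int_0^\infty \phi'(x)\,dx = -i\phi(0)$, that is, $DH = (1/i)\delta$. The sketch below checks this numerically for the unnormalized bump $\phi(x) = e^{-1/(1-x^2)}$ on $(-1, 1)$; the function names and the midpoint rule are illustrative choices.

```python
import math

def phi(x):
    """Unnormalized test function exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def dphi(x):
    """Classical derivative of phi."""
    if abs(x) >= 1:
        return 0.0
    return phi(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def D_heaviside_on_phi(points=4000):
    """(D H)(phi) = -H(D phi), with D = -i d/dx and H the indicator of [0, inf).
    The integral of phi' over (0, 1) is computed by the midpoint rule."""
    h = 1.0 / points
    integral = sum(dphi((i + 0.5) * h) for i in range(points)) * h
    return 1j * integral       # -H(-i phi') = i * integral of phi' over (0, 1)

value = D_heaviside_on_phi()
expected = -1j * phi(0.0)      # ((1/i) delta)(phi) = -i phi(0)
print(value, expected)
```

The same computation with $D$ replaced by the ordinary derivative gives the familiar statement that the derivative of the Heaviside function is $\delta$.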


Proposition II.2.7 (du Bois-Reymond). Let $u, f \in C(\Omega)$, and suppose $D_j u = f$ (in the distribution sense). Then $D_j u = f$ in the classical sense.


Proof. Case $u \in C_c(\Omega)$. Let $u_r, f_r$ be regularizations of $u, f$ (using the same $\phi$ as in II.1). Then

$$r^n D_j u_r(x) = \int u(y)\, D_{x_j}\phi((x-y)/r)\,dy = -\int u(y)\, D_{y_j}\phi((x-y)/r)\,dy = \int (D_j u)(y)\,\phi((x-y)/r)\,dy = r^n f_r(x).$$

By Theorem II.1.2, $u_r \to u$ and $D_j u_r = f_r \to f$ uniformly as $r \to 0$. Therefore, $D_j u = f$ in the classical sense.

Case $u \in C(\Omega)$ arbitrary. Let $\psi \in \mathcal{D}(\Omega)$. Then $v := \psi u \in C_c(\Omega)$, and $D_j v = (D_j\psi)u + \psi f =: g \in C(\Omega)$, so by the first case, $D_j v = g$ in the classical sense. For any point $x \in \Omega$, we may choose $\psi$ not vanishing at $x$. Then $u = v/\psi$ is differentiable with respect to $x_j$ at $x$, and $D_j u = f$ at $x$ (in the classical sense).


Let $\omega$ be an open set with compact closure contained in $\Omega$ (this relation between $\omega$ and $\Omega$ is denoted $\omega \subset\subset \Omega$), and let $\rho = \operatorname{diam}(\omega) := \sup\{|x-y|;\ x,y \in \omega\}$ $(<\infty)$. Denote the unit vectors in the $x_j$-direction by $e_j$. Given $x \in \omega$, let $t_j$ be the smallest positive number $t$ such that $x + te_j \in \partial\omega$. If $\phi \in \mathcal{D}(\omega)$, then $\phi(x + t_je_j) = 0$, and by the mean value theorem, there exists $0 < \tau_j < t_j$ such that

$$|\phi(x)| = |\phi(x) - \phi(x + t_je_j)| = t_j\,|D_j\phi(x + \tau_je_j)| \le \rho\,\sup|D_j\phi|.$$


II.2. Distributions 497 


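Hence

$$\sup|\phi| \le \rho\,\sup|D_j\phi| \qquad (\phi \in \mathcal{D}(\omega)).$$

Therefore, for any multi-index $\alpha$ with $|\alpha| \le k$,

$$\sup|D^\alpha\phi| \le \rho^{\,nk-|\alpha|}\,\sup|D_1^k\cdots D_n^k\phi|,$$

and consequently

$$\|\phi\|_k \le C'\,\sup|D_1^k\cdots D_n^k\phi| \qquad (\phi \in \mathcal{D}(\omega)),$$

where $C' = C'(\rho,k,n)$ is a positive constant. Let $u \in \mathcal{D}'(\Omega)$. By (2), there exist a constant $C > 0$ and a non-negative integer $k$ such that

$$|u(\phi)| \le CC'\,\sup|D_1^k\cdots D_n^k\phi| \qquad (\phi \in \mathcal{D}(\omega)).$$

Write

$$(D_1^k\cdots D_n^k\phi)(x) = i^n\int_{[y<x]} D_1^{k+1}\cdots D_n^{k+1}\phi(y)\,dy,$$

where $[y < x] := \{y \in \mathbb{R}^n;\ y_j < x_j\ (j = 1,\dots,n)\}$ and $\phi$ is extended to $\mathbb{R}^n$ in the usual way ($\phi = 0$ on $\omega^c$). Writing $s := k+1$, we have therefore

$$\sup|D_1^k\cdots D_n^k\phi| \le \int_\omega |D_1^s\cdots D_n^s\phi|\,dy$$

and consequently,

$$|u(\phi)| \le CC'\,\|D_1^s\cdots D_n^s\phi\|_{L^1(\omega)} \qquad (\phi \in \mathcal{D}(\omega)).$$

This means that the linear functional

$$(-1)^{ns}D_1^s\cdots D_n^s\phi \to u(\phi)$$

is continuous with norm $\le CC'$ on the subspace $D_1^s\cdots D_n^s\,\mathcal{D}(\omega)$ of $L^1(\omega)$. By the Hahn-Banach theorem, it has an extension as a continuous linear functional on $L^1(\omega)$ (with the same norm). Therefore there exists $f \in L^\infty(\omega)$ such that

$$u(\phi) = (-1)^{ns}\int_\omega f\,D_1^s\cdots D_n^s\phi\,dx \qquad (\phi \in \mathcal{D}(\omega)).$$

This means that

$$u|_\omega = D_1^s\cdots D_n^s f.$$

We may also define

$$g(x) = i^n\int_{[y<x]} I_\omega(y)\,f(y)\,dy.$$

Then $g$ is continuous, and one verifies easily that $f = D_1\cdots D_n g$ (in the distribution sense). Hence

$$u|_\omega = D_1^{s+1}\cdots D_n^{s+1} g$$

(with $g$ continuous in $\omega$). We proved the following.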


Theorem II.2.8. Let $u \in \mathcal{D}'(\Omega)$. Then for any $\omega \subset\subset \Omega$, there exist a non-negative integer $s$ and a function $f \in L^\infty(\omega)$ such that $u|_\omega = D_1^s\cdots D_n^s f$. Moreover, $f$ may be chosen to be continuous.


II.2.9. Leibnitz's formula.
If $\phi,\psi \in C^\infty(\Omega)$, then for any multi-index $\alpha$

$$D^\alpha(\phi\psi) = \sum_{\beta\le\alpha}\frac{\alpha!}{\beta!\,(\alpha-\beta)!}\,D^\beta\phi\;D^{\alpha-\beta}\psi \tag{3}$$

(the sum goes over all multi-indices $\beta$ with $\beta_j \le \alpha_j$ for all $j$). This general Leibnitz formula follows by repeated application of the usual one-variable formula.

Multiplication of a distribution by a function.

Let $u \in \mathcal{D}'(\Omega)$ and $\psi \in C^\infty(\Omega)$. Since $\psi\phi \in \mathcal{D}(\Omega)$ for all $\phi \in \mathcal{D}(\Omega)$, the map $\phi \to u(\psi\phi)$ is well defined on $\mathcal{D}(\Omega)$. It is trivially linear, and for any compact $K \subset \Omega$, if $C$ and $k$ are as in (2), then for all $\phi \in C_c^\infty(K)$, we have by (3)

$$|u(\psi\phi)| \le C\|\psi\phi\|_k \le C'\|\phi\|_k,$$

where $C'$ is a constant depending on $n, k, K$, and $\psi$. This means that the map defined above is a distribution in $\Omega$; it is denoted $\psi u$ (and called the product of $u$ by $\psi$). Thus

$$(\psi u)(\phi) = u(\psi\phi) \qquad (\phi \in \mathcal{D}(\Omega)) \tag{4}$$

for all $\psi \in C^\infty(\Omega)$.
This definition clearly coincides with the usual pointwise multiplication when $u$ is a function or a measure.
We verify easily the inclusion $Z(\psi) \cup Z(u) \subset Z(\psi u)$, that is,

$$\operatorname{supp}(\psi u) \subset \operatorname{supp}\psi \cap \operatorname{supp} u.$$

It follows from the definitions that

$$D_j(\psi u) = (D_j\psi)u + \psi(D_j u)$$

($\psi \in C^\infty(\Omega)$, $u \in \mathcal{D}'(\Omega)$, $j = 1,\dots,n$), and therefore, by the same formal arguments as in the classical case, Leibnitz's formula (3) is valid in the present situation.

If $P$ is a polynomial on $\mathbb{R}^n$, we denote by $P(D)$ the differential operator obtained by substituting formally $D$ for the variable $x \in \mathbb{R}^n$. Let $P_\alpha(x) := x^\alpha$ for any multi-index $\alpha$. Since $P_\alpha^{(\beta)}(x) = (\alpha!/(\alpha-\beta)!)\,x^{\alpha-\beta}$ for $\beta \le \alpha$ (and equals zero otherwise), we can rewrite (3) in the form

$$P(D)(\psi u) = \sum_\beta \frac{1}{\beta!}\,(D^\beta\psi)\,P^{(\beta)}(D)u \tag{5}$$

for the special polynomials $P = P_\alpha$, hence by linearity, for all polynomials $P$. This is referred to as the "general Leibnitz formula".




II.2.10. The space $\mathcal{E}(\Omega)$ and its dual.
The space $\mathcal{E}(\Omega)$ is the space $C^\infty(\Omega)$ as a (locally convex) t.v.s. with the topology induced by the family of semi-norms

$$\phi \to \|\phi\|_{k,K} := \sum_{|\alpha|\le k}\sup_K|D^\alpha\phi|, \tag{6}$$

with $k = 0,1,2,\dots$, and $K$ varying over all compact subsets of $\Omega$.

A sequence $\{\phi_k\}$ converges to 0 in $\mathcal{E}(\Omega)$ iff $D^\alpha\phi_k \to 0$ uniformly on every compact subset of $\Omega$, for all multi-indices $\alpha$.

A linear functional $u$ on $\mathcal{E}(\Omega)$ is continuous iff there exist constants $k \in \mathbb{N}\cup\{0\}$ and $C > 0$ and a compact set $K$ such that

$$|u(\phi)| \le C\|\phi\|_{k,K} \qquad (\phi \in \mathcal{E}(\Omega)). \tag{7}$$

The dual space $\mathcal{E}'(\Omega)$ of $\mathcal{E}(\Omega)$ consists of all these continuous linear functionals, with the weak*-topology: the net $u_\nu$ converges to $u$ in $\mathcal{E}'(\Omega)$ if $u_\nu(\phi) \to u(\phi)$ for all $\phi \in \mathcal{E}(\Omega)$.

If $u \in \mathcal{E}'(\Omega)$, then by (7), $u(\phi) = 0$ whenever $\phi \in \mathcal{E}(\Omega)$ vanishes in a neighborhood of the compact set $K$ (appearing in (7)). For all $\phi \in \mathcal{D}(\Omega)$, $|u(\phi)| \le C\|\phi\|_{k,K} \le C\|\phi\|_k$, that is, $\tilde{u} := u|_{\mathcal{D}(\Omega)} \in \mathcal{D}'(\Omega)$. Also, if $x \in \Omega - K$ and $\omega$ is an open neighborhood of $x$ contained in $\Omega - K$, then for all $\phi \in \mathcal{D}(\omega)$, $\|\phi\|_{k,K} = 0$, and (7) implies that $u(\phi) = 0$. This shows that $\Omega - K \subset Z(\tilde{u})$, that is, $\operatorname{supp}\tilde{u} \subset K$, that is, $\tilde{u}$ is a distribution with compact support in $\Omega$.

Conversely, let $v$ be a distribution with compact support in $\Omega$, and let $K$ be any compact subset of $\Omega$ containing this support. Fix $\psi \in \mathcal{D}(\Omega)$ such that $\psi = 1$ in a neighborhood of $K$. For any $\phi \in \mathcal{E}(\Omega)$, define $u(\phi) = v(\phi\psi)$. Then $u$ is a well-defined linear functional on $\mathcal{E}(\Omega)$, $u(\phi) = 0$ whenever $\phi = 0$ in a neighborhood of $K$, and $u(\phi) = v(\phi)$ for $\phi \in \mathcal{D}(\Omega)$. On the other hand, if $w$ is a linear functional on $\mathcal{E}(\Omega)$ with these properties, then for all $\phi \in \mathcal{E}(\Omega)$, $\phi\psi \in \mathcal{D}(\Omega)$ and $\phi(1-\psi) = 0$ in a neighborhood of $K$, and consequently,

$$w(\phi) = w(\phi\psi) + w(\phi(1-\psi)) = v(\phi\psi) = u(\phi).$$

This shows that $v$ has a unique extension as a linear functional on $\mathcal{E}(\Omega)$ such that $v(\phi) = 0$ whenever $\phi$ vanishes in a neighborhood of $K$.
Let $Q = \operatorname{supp}\psi$. By (2) applied to the compact set $Q$, there exist $C$ and $k$ such that

$$|u(\phi)| = |v(\phi\psi)| \le C\|\phi\psi\|_k$$

for all $\phi \in \mathcal{E}(\Omega)$ (because $\phi\psi \in \mathcal{D}(\Omega)$). Hence, by Leibnitz's formula, $|u(\phi)| \le C'\|\phi\|_{k,Q}$ for some constant $C'$, that is, $u \in \mathcal{E}'(\Omega)$.

We have established, therefore, that each distribution $v$ with compact support has a unique extension as an element $u \in \mathcal{E}'(\Omega)$, and conversely, each $u \in \mathcal{E}'(\Omega)$ restricted to $\mathcal{D}(\Omega)$ is a distribution with compact support. This relationship allows us to identify $\mathcal{E}'(\Omega)$ with the space of all distributions with compact support in $\Omega$.




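II.2.11. Convolution.
Let $u \in \mathcal{D}' := \mathcal{D}'(\mathbb{R}^n)$ and $\phi \in \mathcal{D} := \mathcal{D}(\mathbb{R}^n)$. For $x \in \mathbb{R}^n$ fixed, let

$$(u*\phi)(x) = u(\phi(x - \cdot)).$$

The function $u*\phi$ is called the convolution of $u$ and $\phi$.

For $x$ fixed, $h \ne 0$ real, and $j = 1,\dots,n$, the functions $(ih)^{-1}[\phi(x + he_j - \cdot) - \phi(x - \cdot)]$ converge to $(D_j\phi)(x - \cdot)$ (as $h \to 0$) in $\mathcal{D}$. Therefore, $(ih)^{-1}[(u*\phi)(x + he_j) - (u*\phi)(x)] \to (u*(D_j\phi))(x)$, that is, $D_j(u*\phi) = u*(D_j\phi)$ in the classical sense. Also $u*(D_j\phi) = (D_ju)*\phi$ by definition of the derivative of a distribution. Iterating, we obtain that $u*\phi \in \mathcal{E} := \mathcal{E}(\mathbb{R}^n)$ and for any multi-index $\alpha$,

$$D^\alpha(u*\phi) = u*(D^\alpha\phi) = (D^\alpha u)*\phi. \tag{8}$$

If $\operatorname{supp} u \cap \operatorname{supp}\phi(x - \cdot) = \emptyset$, then $(u*\phi)(x) = 0$. Equivalently, if $(u*\phi)(x) \ne 0$ then $\operatorname{supp} u$ meets $\operatorname{supp}\phi(x - \cdot)$ at some point $y$, that is, $x - y \in \operatorname{supp}\phi$ and $y \in \operatorname{supp} u$, that is, $x \in \operatorname{supp} u + \operatorname{supp}\phi$. This shows that

$$\operatorname{supp}(u*\phi) \subset \operatorname{supp} u + \operatorname{supp}\phi. \tag{9}$$

Hence

$$\mathcal{E}' * \mathcal{D} \subset \mathcal{D} \tag{10}$$

and in particular

$$\mathcal{D} * \mathcal{D} \subset \mathcal{D}. \tag{11}$$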


Let $\phi_m \to 0$ in $\mathcal{D}$. There exists then a compact set $K$ containing $\operatorname{supp}\phi_m$ for all $m$, and $D^\alpha\phi_m \to 0$ uniformly for all $\alpha$. Let $Q$ be any compact set in $\mathbb{R}^n$. It follows that $\operatorname{supp}\phi_m(x - \cdot) \subset Q - K := \{x - y;\ x \in Q,\ y \in K\}$ for all $x \in Q$. By (2) with the compact set $Q - K$ and the distribution $D^\alpha u$, there exist $C, k$ such that

$$|D^\alpha(u*\phi_m)(x)| = |(D^\alpha u)(\phi_m(x - \cdot))| \le C\|\phi_m(x - \cdot)\|_k = C\|\phi_m\|_k \to 0$$

for all $x \in Q$. Hence, $D^\alpha(u*\phi_m) \to 0$ uniformly on $Q$, and we conclude that $u*\phi_m \to 0$ in the topological space $\mathcal{E}$. In other words, the (linear) map $\phi \to u*\phi$ is sequentially continuous from $\mathcal{D}$ to $\mathcal{E}$. If $u \in \mathcal{E}'$, the map is (sequentially) continuous from $\mathcal{D}$ into itself (cf. (10)). In this case, the definition of $u*\phi$ makes sense for all $\phi \in \mathcal{E}$, and the (linear) map $\phi \to u*\phi$ from $\mathcal{E}$ into itself is continuous (note that $\mathcal{E}(\Omega)$ is metrizable, so there is no need to qualify the continuity; the metrizability follows from the fact that the topology of $\mathcal{E}(\Omega)$ is induced by the countable family of semi-norms $\{\|\cdot\|_{k,K_m};\ k,m = 0,1,2,\dots\}$, where $\{K_m\}$ is a suitable sequence of compact sets with union equal to $\Omega$).

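If $\phi,\psi \in \mathcal{D}$ and $u \in \mathcal{D}'$, it follows from (11) and the fact that $u*\phi \in \mathcal{E}$ that both $u*(\phi*\psi)$ and $(u*\phi)*\psi$ make sense. In order to show that they coincide, we approximate $(\phi*\psi)(x) = \int_Q \phi(x-y)\psi(y)\,dy$ (where $Q$ is an $n$-dimensional cube containing the support of $\psi$) by (finite) Riemann sums of the form

$$\chi_m(x) = m^{-n}\sum_{y\in\mathbb{Z}^n}\phi(x - y/m)\,\psi(y/m).$$

If $\chi_m(x) \ne 0$ for some $x$ and $m$, then there exists $y \in \mathbb{Z}^n$ such that $y/m \in \operatorname{supp}\psi$ and $x - y/m \in \operatorname{supp}\phi$, that is, $x \in y/m + \operatorname{supp}\phi \subset \operatorname{supp}\psi + \operatorname{supp}\phi$. This shows that for all $m$, $\chi_m$ have support in the fixed compact set $\operatorname{supp}\psi + \operatorname{supp}\phi$. Also for all multi-indices $\alpha$,

$$(D^\alpha\chi_m)(x) = m^{-n}\sum_{y\in\mathbb{Z}^n} D^\alpha\phi(x - y/m)\,\psi(y/m) \to ((D^\alpha\phi)*\psi)(x) = (D^\alpha(\phi*\psi))(x)$$

uniformly (in $x$). This means that $\chi_m \to \phi*\psi$ in $\mathcal{D}$. By continuity of $u$ on $\mathcal{D}$,

$$[u*(\phi*\psi)](x) = u((\phi*\psi)(x - \cdot)) = \lim_m u(\chi_m(x - \cdot)) = \lim_m (u*\chi_m)(x) = \lim_m m^{-n}\sum_{y\in\mathbb{Z}^n}(u*\phi)(x - y/m)\,\psi(y/m) = [(u*\phi)*\psi](x),$$

for all $x$, that is,

$$u*(\phi*\psi) = (u*\phi)*\psi \qquad (\phi,\psi \in \mathcal{D};\ u \in \mathcal{D}'). \tag{12}$$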
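The Riemann sums $\chi_m$ used in this argument are easy to experiment with in one variable. The Python sketch below is an illustration only; the two test functions (a bump on $(-1,1)$ and a rescaled, shifted bump on $(-0.25, 0.75)$) and the midpoint-rule reference value are assumptions of ours, not taken from the text.

```python
import math

def bump(x):
    """exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def phi(x):
    return bump(x)

def psi(x):
    # Supported in (-0.25, 0.75): |2x - 0.5| < 1.
    return bump(2.0 * x - 0.5)

def conv(x, n=20000):
    """Reference value of (phi * psi)(x) by the midpoint rule over supp psi."""
    a, b = -0.25, 0.75
    h = (b - a) / n
    return h * sum(phi(x - (a + (k + 0.5) * h)) * psi(a + (k + 0.5) * h)
                   for k in range(n))

def chi(m, x):
    """Riemann sum chi_m(x) = (1/m) * sum over integers y of phi(x - y/m) psi(y/m)."""
    lo, hi = math.floor(-0.25 * m), math.ceil(0.75 * m)
    return sum(phi(x - y / m) * psi(y / m) for y in range(lo, hi + 1)) / m
```

For instance, `chi(200, 0.3)` and `conv(0.3)` agree to many digits, in line with the uniform convergence $\chi_m \to \phi*\psi$ used in the proof.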


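Fix $\phi$ as in Section II.1, consider $\phi_r$ as before, and define $u_r := u*\phi_r$ for any $u \in \mathcal{D}'$.


Proposition II.2.12 (Regularization of distributions). For any distribution $u$ in $\mathbb{R}^n$,

(i) $u_r \in \mathcal{E}$ for all $r > 0$;
(ii) $\operatorname{supp} u_r \subset \operatorname{supp} u + \{x;\ |x| \le r\}$;
(iii) $u_r \to u$ in the space $\mathcal{D}'$.

Proof. Since $\operatorname{supp}\phi_r = \{x;\ |x| \le r\}$, (i) and (ii) are special cases of properties of the convolution discussed in II.2.11.
Denote $J : \psi(x) \in \mathcal{D} \to \tilde\psi(x) := \psi(-x)$. Then

$$u(\psi) = (u*\tilde\psi)(0). \tag{13}$$

By Theorem II.1.2 applied to $\tilde\psi$, $\phi_r*\tilde\psi \to \tilde\psi$ in $\mathcal{D}$. Therefore, by (13),

$$u_r(\psi) = [(u*\phi_r)*\tilde\psi](0) = [u*(\phi_r*\tilde\psi)](0) = u(J(\phi_r*\tilde\psi)) \to u(\psi)$$

for all $\psi \in \mathcal{D}$, that is, $u_r \to u$ in $\mathcal{D}'$.


In particular, $\mathcal{E}$ is sequentially dense in $\mathcal{D}'$.
Note also that if $u*\mathcal{D} = \{0\}$, then $u_r = 0$ for all $r > 0$; letting $r \to 0$, it follows that $u = 0$.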
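As a concrete one-dimensional illustration of (iii), take $u = \delta$, so that $u_r = \delta*\phi_r = \phi_r$ and $u_r(\psi) = \int\phi_r\psi\,dx$ should tend to $\psi(0)$. The Python sketch below assumes that the mollifier of Section II.1 is a normalized bump supported in the unit ball; the specific bump, the test function `psi`, and the midpoint quadrature are hypothetical choices made for the example.

```python
import math

def bump(x):
    """exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

MASS = integrate(bump, -1.0, 1.0)  # normalizing constant so that phi has integral 1

def phi_r(x, r):
    """Mollifier phi_r(x) = r^{-1} phi(x / r) in dimension n = 1."""
    return bump(x / r) / (r * MASS)

def psi(x):
    """A test function with compact support in (-2, 2)."""
    return math.cos(x) * bump(x / 2.0)

def delta_r(r):
    """(delta * phi_r)(psi) = integral of phi_r * psi."""
    return integrate(lambda x: phi_r(x, r) * psi(x), -r, r)
```

Evaluating `delta_r(r)` for $r = 1, 0.1, 0.01$ gives values approaching `psi(0)`, with an error that shrinks roughly like $r^2$.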




II.2.13. Commutation with translations.
Consider the translation operators (for $h \in \mathbb{R}^n$)

$$\tau_h : \phi(x) \to \phi(x - h)$$

from $\mathcal{D}$ into itself. For any $u \in \mathcal{D}'$, it follows from the definitions that

$$\tau_h(u*\phi) = u*(\tau_h\phi),$$

that is, convolution with $u$ commutes with translations. This commutation property, together with the previously observed fact that convolution with $u$ is a (sequentially) continuous linear map from $\mathcal{D}$ to $\mathcal{E}$, characterizes convolution with distributions. Indeed, let $U : \mathcal{D} \to \mathcal{E}$ be linear, sequentially continuous, and commuting with translations. Define $u(\phi) = (U(\tilde\phi))(0)$ $(\phi \in \mathcal{D})$. Then $u$ is a linear functional on $\mathcal{D}$, and if $\phi_k \to 0$ in $\mathcal{D}$, the sequential continuity of $U$ on $\mathcal{D}$ implies that $u(\phi_k) \to 0$. Hence $u \in \mathcal{D}'$. For any $x \in \mathbb{R}^n$ and $\phi \in \mathcal{D}$,

$$(U\phi)(x) = [\tau_{-x}U\phi](0) = [U(\tau_{-x}\phi)](0) = u(J(\tau_{-x}\phi)) = [u*(\tau_{-x}\phi)](0) = [\tau_{-x}(u*\phi)](0) = (u*\phi)(x),$$

that is, $U\phi = u*\phi$, as wanted.


II.2.14. Convolution of distributions.
Let $u, v$ be distributions in $\mathbb{R}^n$, one of which has compact support. The map

$$W : \phi \in \mathcal{D} \to u*(v*\phi) \in \mathcal{E}$$

is linear, continuous, and commutes with translations. By II.2.13, there exists a distribution $w$ such that $W\phi = w*\phi$. By the final observation in II.2.12, $w$ is uniquely determined; we call it the convolution of $u$ and $v$ and denote it by $u*v$; thus, by definition,

$$(u*v)*\phi = u*(v*\phi) \qquad (\phi \in \mathcal{D}). \tag{14}$$

If $v = \psi \in \mathcal{D}$, the right-hand side of (14) equals $(u*\psi)*\phi$ by (12) (where $u*\psi$ is the "usual" convolution of the distribution $u$ with $\psi$). Again by the final observation of II.2.12, it follows that the convolution of the two distributions $u, v$ coincides with the previous definition when $v$ is a function in $\mathcal{D}$ (the same is true if $u \in \mathcal{E}'$ and $v = \psi \in \mathcal{E}$).

One verifies easily that

$$\operatorname{supp}(u*v) \subset \operatorname{supp} u + \operatorname{supp} v. \tag{15}$$

(With $\phi_r$ as in Section II.1, it follows from (9) and Proposition II.2.12 that

$$\operatorname{supp}[(u*v)*\phi_r] = \operatorname{supp}[u*(v*\phi_r)] \subset \operatorname{supp} u + \operatorname{supp} v_r \subset \operatorname{supp} u + \operatorname{supp} v + \{x;\ |x| \le r\},$$

and we obtain (15) by letting $r \to 0$.)


If $u, v, w$ are distributions, two of which (at least) having compact support, then by (15) both convolutions $(u*v)*w$ and $u*(v*w)$ are well-defined distributions. Since their convolutions with any given $\phi \in \mathcal{D}$ coincide, the "associative law" for distributions follows (cf. end of Proposition II.2.12).

Convolution of functions in $\mathcal{E}$ (one of which at least having compact support) is seen to be commutative by a change of variable in the defining integral. If $u \in \mathcal{D}'$ and $\psi \in \mathcal{D}$, we have for all $\phi \in \mathcal{D}$ (by definition and the associative law we just verified!):

$$(u*\psi)*\phi = u*(\psi*\phi) = u*(\phi*\psi) = (u*\phi)*\psi = \psi*(u*\phi) = (\psi*u)*\phi,$$

and therefore $u*\psi = \psi*u$. The same is valid (with a similar proof) when $u \in \mathcal{E}'$ and $\psi \in \mathcal{E}$.

For any two distributions $u, v$ (one of which at least having compact support), the commutative law of convolution follows now from the same formal calculation with $\psi$ replaced by $v$.

Let $\delta$ be the "delta measure at 0" (cf. II.2.6). Then for any multi-index $\alpha$ and $u \in \mathcal{D}'$,

$$(D^\alpha\delta)*u = D^\alpha u. \tag{16}$$

Indeed, observe first that for all $\phi \in \mathcal{D}$, $(\delta*\phi)(x) = \int\phi(x-y)\,d\delta(y) = \phi(x)$, that is, $\delta*\phi = \phi$. Therefore, for any $v \in \mathcal{D}'$, $(v*\delta)*\phi = v*(\delta*\phi) = v*\phi$, and consequently

$$v*\delta = v \qquad (v \in \mathcal{D}'). \tag{17}$$

Now for all $\phi \in \mathcal{D}$,

$$(u*D^\alpha\delta)*\phi = u*(D^\alpha\delta*\phi) = u*(D^\alpha(\delta*\phi)) = u*D^\alpha\phi = (D^\alpha u)*\phi,$$

and (16) follows (cf. end of II.2.12 and (8)).

Next, for any distributions $u, v$ with one at least having compact support, we have by (16) and the associative and commutative laws for convolution:

$$D^\alpha(u*v) = (D^\alpha\delta)*(u*v) = ((D^\alpha\delta)*u)*v = (D^\alpha u)*v$$
$$\phantom{D^\alpha(u*v)} = D^\alpha(v*u) = (D^\alpha v)*u = u*(D^\alpha v).$$

This generalizes (8) to the case when both factors in the convolution are distributions.


II.3. Temperate distributions 


II.3.1. The Schwartz space.
The Schwartz space $\mathcal{S} = \mathcal{S}(\mathbb{R}^n)$ of rapidly decreasing functions consists of all $\phi \in \mathcal{E} = \mathcal{E}(\mathbb{R}^n)$ such that

$$\|\phi\|_{\alpha,\beta} := \sup_x |x^\beta D^\alpha\phi(x)| < \infty.$$

The topology induced on $\mathcal{S}$ by the family of semi-norms $\|\cdot\|_{\alpha,\beta}$ (where $\alpha,\beta$ range over all multi-indices) makes $\mathcal{S}$ into a locally convex (metrizable) topological vector space. It follows from Leibnitz's formula that $\mathcal{S}$ is a topological algebra for pointwise multiplication. It is also closed under multiplication by polynomials and application of any operator $D^\alpha$, with both operations continuous from $\mathcal{S}$ into itself.

We have the topological inclusions

$$\mathcal{D} \subset \mathcal{S} \subset \mathcal{E}; \qquad \mathcal{S} \subset L^1 := L^1(\mathbb{R}^n).$$

(Let $p_n(x) = \prod_{j=1}^n (1 + x_j^2)$; since $\|1/p_n\|_{L^1} = \pi^n$, we have

$$\|\phi\|_{L^1} \le \pi^n \sup_x |p_n(x)\phi(x)|. \tag{1}$$

If $\phi_k \to 0$ in $\mathcal{S}$, it follows from (1) that $\phi_k \to 0$ in $L^1$.)

Fix $\phi$ as in II.1.1, and let $\chi$ be the indicator of the ball $\{x;\ |x| \le 2\}$. The function $\psi := \chi*\phi$ belongs to $\mathcal{D}$ (cf. Theorem II.1.2), and $\psi = 1$ on the closed unit ball (because for $|x| \le 1$, $\{y;\ |y| \le 1 \text{ and } |x-y| \le 2\} = \{y;\ |y| \le 1\} = \operatorname{supp}\phi$, and therefore $\psi(x) = \int_{\operatorname{supp}\phi}\chi(x-y)\phi(y)\,dy = \int\phi(y)\,dy = 1$).

Now, for any $\phi \in \mathcal{S}$ and the function $\psi$ defined earlier, consider the functions $\phi_r(x) := \phi(x)\psi(rx)$ $(r > 0)$ (not to be confused with the functions defined in II.1.1). Then $\phi_r \in \mathcal{D}$ and $\phi - \phi_r = 0$ for $|x| \le 1/r$. Therefore $\|\phi - \phi_r\|_{\alpha,\beta} = \sup_{|x|\ge 1/r}|x^\beta D^\alpha(\phi - \phi_r)|$. We may choose $M$ such that $\sup_x |x|^2\,|x^\beta D^\alpha(\phi - \phi_r)| < M$ for all $0 < r \le 1$. Then $\|\phi - \phi_r\|_{\alpha,\beta} \le \sup_{|x|\ge 1/r} M/|x|^2 \le Mr^2 \to 0$ when $r \to 0$. Thus, $\phi_r \to \phi$ in $\mathcal{S}$, and we conclude that $\mathcal{D}$ is dense in $\mathcal{S}$. A similar argument shows that $\phi_r \to \phi$ in $\mathcal{E}$ for any $\phi \in \mathcal{E}$; hence $\mathcal{D}$ (and therefore $\mathcal{S}$) is dense in $\mathcal{E}$.


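II.3.2. The Fourier transform on $\mathcal{S}$.
Denote the inner product in $\mathbb{R}^n$ by $x\cdot y$ ($x\cdot y := \sum_j x_jy_j$), and let $F : f \to \hat{f}$ be the Fourier transform on $L^1$:

$$\hat{f}(y) = \int e^{-ix\cdot y} f(x)\,dx \qquad (f \in L^1). \tag{2}$$

(All integrals in the sequel are over $\mathbb{R}^n$, unless specified otherwise.)

If $\phi \in \mathcal{S}$, $x^\alpha\phi(x) \in \mathcal{S} \subset L^1$ for all multi-indices $\alpha$, and therefore, the integral $\int e^{-ix\cdot y}(-x)^\alpha\phi(x)\,dx$ converges uniformly in $y$. Since this integral is the result of applying $D^\alpha$ to the integrand of (2), we have $\hat\phi \in \mathcal{E}$ and

$$D^\alpha\hat\phi = F[(-x)^\alpha\phi(x)]. \tag{3}$$

For any multi-index $\beta$, it follows from (3) (by integration by parts) that

$$y^\beta(D^\alpha\hat\phi)(y) = FD^\beta[(-x)^\alpha\phi(x)]. \tag{4}$$

In particular ($\alpha = 0$)

$$y^\beta\hat\phi(y) = [FD^\beta\phi](y). \tag{5}$$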


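Since $D^\beta[(-x)^\alpha\phi(x)] \in \mathcal{S} \subset L^1$, it follows from (4) that $y^\beta(D^\alpha\hat\phi)(y)$ is a bounded function of $y$, that is, $\hat\phi \in \mathcal{S}$. Moreover, by (1) and (4),

$$\sup_y |y^\beta(D^\alpha\hat\phi)(y)| \le \|D^\beta[(-x)^\alpha\phi(x)]\|_{L^1} \le \pi^n \sup_x \big|p_n(x)\,D^\beta[(-x)^\alpha\phi(x)]\big|.$$

This inequality shows that if $\phi_k \to 0$ in $\mathcal{S}$, then $\hat\phi_k \to 0$ in $\mathcal{S}$, that is, the map $F : \phi \in \mathcal{S} \to \hat\phi \in \mathcal{S}$ is a continuous (linear) operator. Denote

$$M^\beta : \phi(x) \in \mathcal{S} \to x^\beta\phi(x) \in \mathcal{S}.$$

Then (4) can be written as the operator identity on $\mathcal{S}$:

$$M^\beta D^\alpha F = FD^\beta(-M)^\alpha. \tag{6}$$

A change of variables shows that

$$(F\psi(ry))(s) = r^{-n}\hat\psi(s/r) \tag{7}$$

for any $\psi \in L^1$ and $r > 0$.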
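If $\phi,\psi \in L^1$, an application of Fubini's theorem gives

$$\int \hat\phi(y)\psi(y)e^{ix\cdot y}\,dy = \int \hat\psi(t-x)\phi(t)\,dt = \int \hat\psi(s)\phi(x+s)\,ds. \tag{8}$$

Replacing $\psi(y)$ by $\psi(ry)$ in (8), we obtain by (7) and the change of variable $s = rt$

$$\int \hat\phi(y)\psi(ry)e^{ix\cdot y}\,dy = r^{-n}\int \hat\psi(s/r)\phi(x+s)\,ds = \int \hat\psi(t)\phi(x+rt)\,dt. \tag{9}$$

In case $\phi,\psi \in \mathcal{S}$ $(\subset L^1)$, we have $\hat\phi,\hat\psi \in \mathcal{S} \subset L^1$, and $\phi,\psi$ are bounded and continuous. By Lebesgue's dominated convergence theorem, letting $r \to 0$ in (9) gives

$$\psi(0)\int \hat\phi(y)e^{ix\cdot y}\,dy = \phi(x)\int \hat\psi(t)\,dt \tag{10}$$

for all $\phi,\psi \in \mathcal{S}$ and $x \in \mathbb{R}^n$.

Choose, for example, $\psi(x) = (2\pi)^{-n/2}e^{-|x|^2/2}$. Then $\hat\psi(t) = e^{-|t|^2/2}$ (cf. I.3.12) and $\int\hat\psi(t)\,dt = (2\pi)^{n/2}$. Substituting these values in (10), we obtain (for all $\phi \in \mathcal{S}$)

$$\phi(x) = (2\pi)^{-n}\int \hat\phi(y)e^{ix\cdot y}\,dy. \tag{11}$$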


This is the inversion formula for the Fourier transform $F$ on $\mathcal{S}$. It shows that $F$ is an automorphism of $\mathcal{S}$, whose inverse is given by

$$F^{-1} = (2\pi)^{-n}JF, \tag{12}$$

where $J : \phi \to \tilde\phi$.
Note that $JF = FJ$ and $F^2 = (2\pi)^nJ$.
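For $n = 1$ the Gaussian computation and the inversion formula can be checked numerically. The following Python sketch is an illustration under our own assumptions: integrals over $\mathbb{R}$ are truncated to $[-12, 12]$ and approximated by a midpoint rule, which is extremely accurate for Gaussians. The function names are hypothetical.

```python
import cmath
import math

def quad(f, L=12.0, n=600):
    """Midpoint rule for the integral of f over [-L, L]."""
    h = 2.0 * L / n
    return h * sum(f(-L + (k + 0.5) * h) for k in range(n))

def psi(x):
    """psi(x) = (2 pi)^{-1/2} exp(-x^2 / 2), the Gaussian chosen in the text (n = 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def fourier(f, y):
    """F f(y) = integral of exp(-i x y) f(x) dx, with the convention of (2)."""
    return quad(lambda x: cmath.exp(-1j * x * y) * f(x))

def inverse(f_hat, x):
    """(2 pi)^{-1} * integral of f_hat(y) exp(i x y) dy, the right-hand side of (11)."""
    return quad(lambda y: f_hat(y) * cmath.exp(1j * x * y)) / (2.0 * math.pi)
```

With these definitions, `fourier(psi, y)` reproduces $e^{-y^2/2}$, and applying `inverse` to the numerically computed transform recovers `psi`.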
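Also, by definition and the inversion formula,

$$\big(F\overline{\hat\psi}\big)(y) = \overline{\int e^{ix\cdot y}\,\hat\psi(x)\,dx} = (2\pi)^n\,\overline{\psi(y)},$$

that is,

$$F(\overline{\hat\psi}) = (2\pi)^n\,\overline{\psi} \qquad (\psi \in \mathcal{S}). \tag{13}$$

(It is sometimes advantageous to define the Fourier transform by $\mathcal{F} = (2\pi)^{-n/2}F$; the inversion formula for $\mathcal{F} : \mathcal{S} \to \mathcal{S}$ is then $\mathcal{F}^{-1} = J\mathcal{F}$, and the last identities become $\mathcal{F}^2 = J$ and $\mathcal{F}C\mathcal{F} = C$, where $C : \psi \to \overline{\psi}$ is the conjugation operator.)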
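An application of Fubini's theorem shows that

$$F(\phi*\psi) = \hat\phi\,\hat\psi \qquad (\phi,\psi \in \mathcal{S}). \tag{14}$$

(This is true actually for all $\phi,\psi \in L^1$.)
Replacing $\phi,\psi$ by $\hat\phi,\hat\psi$ $(\in \mathcal{S})$, respectively, we get

$$F(\hat\phi*\hat\psi) = (F^2\phi)(F^2\psi) = (2\pi)^{2n}(J\phi)(J\psi) = (2\pi)^{2n}J(\phi\psi) = (2\pi)^nF^2(\phi\psi).$$

Hence

$$F(\phi\psi) = (2\pi)^{-n}\,\hat\phi*\hat\psi \qquad (\phi,\psi \in \mathcal{S}). \tag{15}$$

For $x = 0$, the identity (8) becomes

$$\int \hat\phi\,\psi\,dx = \int \phi\,\hat\psi\,dx \qquad (\phi,\psi \in L^1). \tag{16}$$

In case $\psi \in \mathcal{S}$ (so that $\hat\psi \in \mathcal{S} \subset L^1$), we replace $\psi$ by $\overline{\hat\psi}$ in (16); using (13), we get

$$\int \hat\phi\,\overline{\hat\psi}\,dx = (2\pi)^n\int \phi\,\overline{\psi}\,dx \qquad (\phi,\psi \in \mathcal{S}). \tag{17}$$

This is Parseval's formula for the Fourier transform.
In terms of the normalized operator $\mathcal{F} = (2\pi)^{-n/2}F$, the formula takes the form

$$(\mathcal{F}\phi, \mathcal{F}\psi) = (\phi,\psi) \qquad (\phi,\psi \in \mathcal{S}), \tag{18}$$

where $(\cdot,\cdot)$ is the $L^2$ inner product.
In particular,

$$\|\mathcal{F}\phi\|_2 = \|\phi\|_2, \tag{19}$$

where $\|\cdot\|_2$ denotes here the $L^2$-norm.

Thus, $\mathcal{F}$ is a (linear) isometry of $\mathcal{S}$ onto itself, with respect to the $L^2$-norm on $\mathcal{S}$. Since $\mathcal{S}$ is dense in $L^2$ (recall that $\mathcal{D} \subset \mathcal{S}$, and $\mathcal{D}$ is dense in $L^2$), the operator $\mathcal{F}$ extends uniquely as a linear isometry of $L^2$ onto itself. This operator, also denoted $\mathcal{F}$, is called the $L^2$-Fourier transform.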
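Parseval's formula (17) can likewise be tested numerically for $n = 1$. The sketch below is a hypothetical example, not part of the text: it uses the two Gaussians $\phi(x) = e^{-x^2/2}$ and $\psi(x) = e^{-(x-1)^2}$, truncates all integrals to $[-12,12]$, and uses a midpoint rule.

```python
import cmath
import math

def quad(f, L=12.0, n=600):
    """Midpoint rule for the integral of f over [-L, L]."""
    h = 2.0 * L / n
    return h * sum(f(-L + (k + 0.5) * h) for k in range(n))

def fourier(f, y):
    """F f(y) = integral of exp(-i x y) f(x) dx."""
    return quad(lambda x: cmath.exp(-1j * x * y) * f(x))

def phi(x):
    return math.exp(-0.5 * x * x)

def psi(x):
    return math.exp(-(x - 1.0) ** 2)

def parseval_lhs():
    """Integral of F(phi) times the complex conjugate of F(psi)."""
    return quad(lambda y: fourier(phi, y) * fourier(psi, y).conjugate())

def parseval_rhs():
    """(2 pi)^n times the integral of phi * conj(psi), with n = 1 and psi real."""
    return 2.0 * math.pi * quad(lambda x: phi(x) * psi(x))
```

The two sides agree up to quadrature error; dividing both transforms by $\sqrt{2\pi}$ gives the isometry statement (18) for the normalized operator.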




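Example. Consider the orthonormal sequence $\{f_k;\ k \in \mathbb{Z}\}$ in $L^2(\mathbb{R})$ defined in the second example of Section 8.11. Since the normalized transform $\mathcal{F} = (2\pi)^{-1/2}F$ is a Hilbert automorphism of $L^2(\mathbb{R})$, the sequence $\{g_k := \mathcal{F}f_k;\ k \in \mathbb{Z}\}$ is orthonormal in $L^2(\mathbb{R})$. The fact that $f_k$ are also in $L^1(\mathbb{R})$ allows us to calculate as follows

$$g_k(y) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} e^{-ixy}f_k(x)\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-ix(y-k)}\,dx = (-1)^k\,\frac{\sin \pi y}{\pi(y-k)}.$$

Note in particular that

$$\int_{\mathbb{R}} \frac{\sin^2(\pi y)}{(\pi y)^2}\,dy = \|g_0\|_2^2 = 1,$$

that is, $\int_{\mathbb{R}} \sin^2 t/t^2\,dt = \pi$. Integrating by parts, we have for all $a < b$ real

$$\int_a^b \frac{\sin^2 t}{t^2}\,dt = \int_a^b \sin^2 t\,d(-1/t) = \frac{\sin^2 a}{a} - \frac{\sin^2 b}{b} + \int_a^b \frac{\sin(2t)}{t}\,dt.$$

Letting $a \to -\infty$ and $b \to \infty$, we see that the integral $\int_{\mathbb{R}} \sin(2t)/t\,dt$ converges and has the value $\pi$, that is,

$$\int_{\mathbb{R}} \frac{\sin t}{t}\,dt = \pi.$$

This is the so-called Dirichlet integral.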
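The value $\int_{\mathbb{R}}\sin^2 t/t^2\,dt = \pi$ can be checked numerically. The Python sketch below is only an illustration; the truncation to $[-T, T]$ with $T = 2000$ and the midpoint rule are our own choices. Since $\sin^2 t = (1 - \cos 2t)/2$, the neglected tails contribute approximately $1/T$, which the second function adds back as a correction.

```python
import math

def integrand(t):
    """sin^2(t) / t^2, extended by its limit 1 at t = 0."""
    if t == 0.0:
        return 1.0
    s = math.sin(t) / t
    return s * s

def truncated_integral(T=2000.0, n=400000):
    """Midpoint rule for the integral of sin^2(t)/t^2 over [-T, T]."""
    h = 2.0 * T / n
    return h * sum(integrand(-T + (k + 0.5) * h) for k in range(n))

def corrected_integral(T=2000.0, n=400000):
    """Adds the leading tail estimate 1/T (both tails together) to the truncated value."""
    return truncated_integral(T, n) + 1.0 / T
```

The truncated value falls short of $\pi$ by about $1/T$, while the corrected value is much closer.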


If $g = \mathcal{F}f$ (with $\mathcal{F}$ the $L^2$-Fourier transform) is in the closure of the span of $\{g_k\}$ in $L^2(\mathbb{R})$, $f$ is necessarily in the closure of the span of $\{f_k\}$, hence, vanishing on $(-\pi,\pi)^c$; in particular, $f$ is also in $L^1(\mathbb{R})$, and therefore $g$ is continuous. Also $g$ has the unique $L^2(\mathbb{R})$-convergent generalized Fourier expansion $g = \sum_{k\in\mathbb{Z}} a_kg_k$ (equality in $L^2$). Since both $\{a_k\}$ and $\{\|g_k\|_\infty\}$ are in $l^2(\mathbb{Z})$, it follows (by Schwarz's inequality for $l^2(\mathbb{Z})$) that the shown series for $g$ converges (absolutely and) uniformly on $\mathbb{R}$; in particular, $\sum a_kg_k$ is continuous. Since $g$ is continuous as well, $g = \sum a_kg_k$ everywhere, that is,

$$g(y) = \frac{\sin \pi y}{\pi}\sum_k (-1)^k a_k/(y - k).$$

Letting $y \to n$ for any given $n \in \mathbb{Z}$, we get $g(n) = a_n$. Thus

$$g(y) = \frac{\sin \pi y}{\pi}\sum_k (-1)^k g(k)/(y - k).$$
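The sampling expansion just obtained can be tested on a concrete function. In the Python sketch below (an illustration with a function of our own choosing), $g(y) = \big(\sin(\pi y/2)/(\pi y/2)\big)^2$, which is, up to a constant factor, the Fourier transform of a triangle function supported in $[-\pi,\pi]$ and hence lies in the closure of the span of $\{g_k\}$. The series is truncated at $|k| \le N$.

```python
import math

def g(y):
    """g(y) = (sin(pi y / 2) / (pi y / 2))^2, with g(0) = 1."""
    if y == 0.0:
        return 1.0
    s = math.sin(0.5 * math.pi * y) / (0.5 * math.pi * y)
    return s * s

def sampling_series(y, N=2000):
    """(sin(pi y) / pi) * sum over |k| <= N of (-1)^k g(k) / (y - k), for non-integer y."""
    total = 0.0
    for k in range(-N, N + 1):
        sign = -1.0 if k % 2 else 1.0
        total += sign * g(float(k)) / (y - k)
    return math.sin(math.pi * y) / math.pi * total
```

At non-integer points such as $y = 0.3$ or $y = 1.5$ the truncated series reproduces `g(y)` closely; the terms decay like $|k|^{-3}$ for this particular $g$.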


II.3.3. The dual space $\mathcal{S}'$.
If $u \in \mathcal{S}'$, then $u|_{\mathcal{D}} \in \mathcal{D}'$ (because the inclusion $\mathcal{D} \subset \mathcal{S}$ is topological). Moreover, since $\mathcal{D}$ is dense in $\mathcal{S}$, $u$ is uniquely determined by its restriction to $\mathcal{D}$. The one-to-one map $u \in \mathcal{S}' \to u|_{\mathcal{D}} \in \mathcal{D}'$ allows us to identify $\mathcal{S}'$ as a subspace of $\mathcal{D}'$; its elements are called temperate distributions. We also have $\mathcal{E}' \subset \mathcal{S}'$:

$$\mathcal{E}' \subset \mathcal{S}' \subset \mathcal{D}',$$

topologically.

Note that $L^p \subset \mathcal{S}'$ for any $p \in [1,\infty]$.
The Fourier transform of the temperate distribution $u$ is defined by

$$\hat{u}(\phi) = u(\hat\phi) \qquad (\phi \in \mathcal{S}). \tag{20}$$

If $u$ is a function in $L^1$, it follows from (16) and the density of $\mathcal{S}$ in $L^1$ that its Fourier transform as a temperate distribution "is the function $\hat{u}$", the usual $L^1$-Fourier transform. Similarly, if $u \in M(\mathbb{R}^n)$, that is, if $u$ is a (regular Borel) complex measure, $\hat{u}$ "is" the usual Fourier-Stieltjes transform of $u$, defined by

$$\hat{u}(y) = \int_{\mathbb{R}^n} e^{-ix\cdot y}\,du(x) \qquad (y \in \mathbb{R}^n).$$

In general, (20) defines $\hat{u}$ as a temperate distribution, since the map $F : \phi \to \hat\phi$ is a continuous linear map of $\mathcal{S}$ into itself and $\hat{u} := u \circ F$. We shall write $Fu$ for $\hat{u}$ (using the same notation for the "extended" operator); $F$ is trivially a continuous linear operator on $\mathcal{S}'$.

Similarly, we define the (continuous linear) operators $J$ and $\mathcal{F}$ on $\mathcal{S}'$ by

$$(Ju)(\phi) = u(J\phi) \qquad (\phi \in \mathcal{S})$$

and $\mathcal{F} = (2\pi)^{-n/2}F$.
We have for all $\phi \in \mathcal{S}$

$$(F^2u)(\phi) = u(F^2\phi) = (2\pi)^n u(J\phi) = (2\pi)^n (Ju)(\phi),$$

that is

$$F^2 = (2\pi)^n J$$

on $\mathcal{S}'$. It follows that $F$ is a continuous automorphism of $\mathcal{S}'$; its inverse is given by the Fourier inversion formula $F^{-1} = (2\pi)^{-n}JF$ (equivalently, $\mathcal{F}^{-1} = J\mathcal{F}$) on $\mathcal{S}'$.
It follows in particular that the restrictions of $F$ to $L^1$ and to $M := M(\mathbb{R}^n)$ are one-to-one. This is the so-called uniqueness property of the $L^1$-Fourier transform and of the Fourier-Stieltjes transform, respectively.
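If $u \in L^2$ $(\subset \mathcal{S}')$, then for all $\phi \in \mathcal{S}$ (with the normalized operator $\mathcal{F} = (2\pi)^{-n/2}F$)

$$|(\mathcal{F}u)(\phi)| = |u(\mathcal{F}\phi)| = \Big|\int u\,\mathcal{F}\phi\,dx\Big| \le \|u\|_2\,\|\mathcal{F}\phi\|_2 = \|u\|_2\,\|\phi\|_2.$$

Thus, $\mathcal{F}u$ is a continuous linear functional on the dense subspace $\mathcal{S}$ of $L^2$ with norm $\le \|u\|_2$. It extends uniquely as a continuous linear functional on $L^2$ with the same norm. By the ("Little") Riesz representation theorem, there exists a unique $g \in L^2$ such that $\|g\|_2 \le \|u\|_2$ and

$$(\mathcal{F}u)(\phi) = \int \phi\,g\,dx \qquad (\phi \in \mathcal{S}).$$

This shows that $\mathcal{F}u$ "is" the $L^2$-function $g$, that is, $\mathcal{F}$ maps $L^2$ into itself. The identity $u = \mathcal{F}^2Ju$ shows that $\mathcal{F}L^2 = L^2$. Also $\|\mathcal{F}u\|_2 = \|g\|_2 \le \|u\|_2$, so that

$$\|u\|_2 = \|Ju\|_2 = \|\mathcal{F}^2u\|_2 \le \|\mathcal{F}u\|_2,$$

and the equality $\|\mathcal{F}u\|_2 = \|u\|_2$ follows. This proves that $\mathcal{F}|_{L^2}$ is a (linear) isometry of $L^2$ onto itself. Its restriction to the dense subspace $\mathcal{S}$ of $L^2$ is the operator $\mathcal{F}$ originally defined on $\mathcal{S}$; therefore $\mathcal{F}|_{L^2}$ coincides with the $L^2$ Fourier transform defined at the end of II.3.2.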

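The formulae relating the operators $F$, $D^\alpha$, and $M^\alpha$ on $\mathcal{S}$ extend easily to $\mathcal{S}'$: if $u \in \mathcal{S}'$, then for all $\phi \in \mathcal{S}$,

$$[FD^\alpha u]\phi = (D^\alpha u)(F\phi) = u\big((-D)^\alpha F\phi\big) = u(FM^\alpha\phi) = (M^\alpha Fu)(\phi),$$

that is, $FD^\alpha = M^\alpha F$ on $\mathcal{S}'$. By linearity of $F$, it follows that for any polynomial $P$ on $\mathbb{R}^n$, $FP(D) = P(M)F$ on $\mathcal{S}'$.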


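Theorem II.3.4. If $u \in \mathcal{E}'$, $\hat{u}$ "is the function" $\hat{u}(y) := u(e^{-ix\cdot y})$ ($y$ is a parameter on the right of the equation), and extends to $\mathbb{C}^n$ as the entire function $\psi(z) := u(e^{-ix\cdot z})$ $(z \in \mathbb{C}^n)$. In particular, $\hat{u} \in \mathcal{E}$.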


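Proof.

(i) Case $u \in \mathcal{D}$ $(\subset \mathcal{E}')$. Since $\psi(z) = \int_{\operatorname{supp} u} e^{-ix\cdot z}u(x)\,dx$, it is clear that $\psi$ is entire, and coincides with $\hat{u}$ on $\mathbb{R}^n$.

(ii) General case. Let $\phi$ be as in II.1.1, and consider the regularizations $u_r := u*\phi_r \in \mathcal{D}$. By Proposition II.2.12, $u_r \to u$ in $\mathcal{D}'$ and $\operatorname{supp} u_r \subset \operatorname{supp} u + \{x;\ |x| \le 1\} := K$ for all $0 < r \le 1$. Given $\phi \in \mathcal{E}$, choose $\phi' \in \mathcal{D}$ that coincides with $\phi$ in a neighborhood of $K$. Then for $0 < r \le 1$, $u_r(\phi) = u_r(\phi') \to u(\phi') = u(\phi)$ as $r \to 0$, that is, $u_r \to u$ in $\mathcal{E}'$, hence also in $\mathcal{S}'$. Therefore $\hat{u}_r \to \hat{u}$ in $\mathcal{S}'$ (by continuity of the Fourier transform on $\mathcal{S}'$). By Case (i), $\hat{u}_r(y) = u_r(e^{-ix\cdot y}) \to u(e^{-ix\cdot y})$, since $u_r \to u$ in $\mathcal{E}'$. More precisely, for any $z \in \mathbb{C}^n$,

$$u_r(e^{-ix\cdot z}) = [(u*\phi_r)*e^{ix\cdot z}](0) = [u*(\phi_r*e^{ix\cdot z})](0).$$

However,

$$\phi_r*e^{ix\cdot z} = \int e^{i(x-y)\cdot z}\,r^{-n}\phi(y/r)\,dy = e^{ix\cdot z}\int e^{-it\cdot rz}\phi(t)\,dt = e^{ix\cdot z}\,\hat\phi(rz).$$

Therefore,

$$u_r(e^{-ix\cdot z}) = \hat\phi(rz)\,(u*e^{ix\cdot z})(0) = \hat\phi(rz)\,u(e^{-ix\cdot z}),$$

and so, for all $0 < r \le 1$,

$$|u_r(e^{-ix\cdot z}) - u(e^{-ix\cdot z})| = |\hat\phi(rz) - 1|\,|u(e^{-ix\cdot z})| = \Big|\int (e^{-ix\cdot rz} - 1)\phi(x)\,dx\Big|\,|u(e^{-ix\cdot z})| \le C\,\|e^{-ix\cdot z}\|_{k,K}\,|z|\,e^{|\Im z|}\,r,$$

with the constants $C, k$ and the compact set $K$ independent of $r$. Consequently $u_r(e^{-ix\cdot z}) \to u(e^{-ix\cdot z})$ as $r \to 0$, uniformly with respect to $z$ on compact subsets of $\mathbb{C}^n$. Since $u_r(e^{-ix\cdot z})$ are entire (cf. Case (i)), it follows that $u(e^{-ix\cdot z})$ is entire.
In order to verify that $\hat{u}$ is the function $u(e^{-ix\cdot y})$, it suffices to show that

$$\hat{u}(\phi) = \int \phi(y)\,u(e^{-ix\cdot y})\,dy$$

for arbitrary $\phi \in \mathcal{D}$, since $\mathcal{D}$ is dense in $\mathcal{S}$. Let $K$ be the compact support of $\phi$, and $0 < r \le 1$. Since $\phi(y)\hat{u}_r(y) \to \phi(y)u(e^{-ix\cdot y})$ uniformly on $K$ as $r \to 0$, we get

$$\hat{u}(\phi) = \lim_{r\to 0}\hat{u}_r(\phi) = \lim_{r\to 0}\int_K \phi(y)\hat{u}_r(y)\,dy = \int_K \phi(y)\,u(e^{-ix\cdot y})\,dy,$$

as desired.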


The entire function $u(e^{-ix\cdot z})$ will be denoted $\hat{u}(z)$; since the distribution $\hat{u}$ "is" this function restricted to $\mathbb{R}^n$ (by Theorem II.3.4), the notation is justified. The function $\hat{u}(z)$ is called the Fourier-Laplace transform of $u \in \mathcal{E}'$.


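Theorem II.3.5. If $u \in \mathcal{E}'$ and $v \in \mathcal{S}'$, then $u*v \in \mathcal{S}'$ and $F(u*v) = \hat{u}\hat{v}$.


Proof. By Theorem II.3.4, $\hat{u} \in \mathcal{E}$, and therefore the product $\hat{u}\hat{v}$ makes sense as a distribution (cf. (4), II.2.9). We prove next that $u*v \in \mathcal{S}'$; then $F(u*v)$ will make sense as well (and belong to $\mathcal{S}'$), and it will remain to verify the identity.
For all $\phi \in \mathcal{D}$,

$$(u*v)(\phi) = [(u*v)*J\phi](0) = [u*(v*J\phi)](0) = u(J(v*J\phi)) = (Ju)(v*J\phi). \tag{21}$$

Since $Ju \in \mathcal{E}'$, there exist $K \subset \mathbb{R}^n$ compact and constants $C > 0$ and $k \in \mathbb{N}\cup\{0\}$ (all independent of $\phi$) such that

$$|(u*v)(\phi)| \le C\|v*J\phi\|_{k,K} \tag{22}$$

(cf. (7) in II.2.10).
For each multi-index $\alpha$, $D^\alpha v \in \mathcal{S}'$. Therefore, there exist a constant $C' > 0$ and multi-indices $\beta,\gamma$ (all independent of $\phi$ and $x$), such that

$$|(D^\alpha v*J\phi)(x)| = |(D^\alpha v)(\phi(y - x))| \le C'\sup_y |y^\beta(D^\gamma\phi)(y)|. \tag{23}$$

By (22) and (23), $|(u*v)(\phi)|$ can be estimated by semi-norms of $\phi$ in $\mathcal{S}$ (for all $\phi \in \mathcal{D}$). Since $\mathcal{D}$ is dense in $\mathcal{S}$, $u*v$ extends uniquely as a continuous linear functional on $\mathcal{S}$. Thus, $u*v \in \mathcal{S}'$.

We now verify the identity of the theorem, first in the special case $u \in \mathcal{D}$. Then $\hat{u} \in \mathcal{S}$, and therefore $\hat{u}\hat{v} \in \mathcal{S}'$ (by a simple application of Leibnitz's formula).

By (21), if $\psi \in \mathcal{S}$ is such that $\hat\psi \in \mathcal{D}$,

$$[F(u*v)](\psi) = (u*v)(\hat\psi) = (v*u)(\hat\psi) = v(J(u*J\hat\psi)).$$

For $f, g \in \mathcal{D}$, it follows from the integral definition of the convolution that $J(f*g) = (Jf)*(Jg)$. Then, by (12) and (15) in II.3.2,

$$[F(u*v)](\psi) = v((Ju)*\hat\psi) = v[(2\pi)^{-n}(F^2u)*(F\psi)] = v(F(\hat{u}\psi)) = \hat{v}(\hat{u}\psi) = (\hat{u}\hat{v})(\psi),$$

where the last equality follows from the definition of the product of the distribution $\hat{v}$ by the function $\hat{u} \in \mathcal{S} \subset \mathcal{E}$. Hence $F(u*v) = \hat{u}\hat{v}$ on the set $\mathcal{S}_0 = \{\psi \in \mathcal{S};\ \hat\psi \in \mathcal{D}\}$. For $\psi \in \mathcal{S}$ arbitrary, since $\hat\psi \in \mathcal{S}$ and $\mathcal{D}$ is dense in $\mathcal{S}$, there exists a sequence $\phi_k \in \mathcal{D}$ such that $\phi_k \to \hat\psi$ in $\mathcal{S}$. Let $\psi_k = F^{-1}\phi_k$. Then $\psi_k \in \mathcal{S}$, $F\psi_k = \phi_k \in \mathcal{D}$ (that is, $\psi_k \in \mathcal{S}_0$), and $\psi_k \to \psi$ by continuity of $F^{-1}$ on $\mathcal{S}$ (i.e., $\mathcal{S}_0$ is dense in $\mathcal{S}$).

Since $F(u*v)$ and $\hat{u}\hat{v}$ are in $\mathcal{S}'$ (as observed) and coincide on $\mathcal{S}_0$, they are indeed equal.

Consider next the general case $u \in \mathcal{E}'$. If $\phi \in \mathcal{D}$, then $\phi*u \in \mathcal{D}$ and $v \in \mathcal{S}'$, and also $\phi \in \mathcal{D}$ and $u*v \in \mathcal{S}'$. Applying the special case to these two pairs, we get

$$(\hat\phi\hat{u})\hat{v} = [F(\phi*u)]\hat{v} = F[(\phi*u)*v] = F[\phi*(u*v)] = \hat\phi\,F(u*v).$$

For each point $y$, we can choose $\phi \in \mathcal{D}$ such that $\hat\phi(y) \ne 0$; it follows that $\hat{u}\hat{v} = F(u*v)$.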


The next theorem characterizes Fourier—Laplace transforms of distributions 
with compact support (cf. Theorem II.3.4). 


Theorem II.3.6 (Paley-Wiener-Schwartz).

(i) The entire function $f$ on $\mathbb{C}^n$ is the Fourier-Laplace transform of a distribution $u$ with support in the ball $S_A := \{x \in \mathbb{R}^n;\ |x| \le A\}$ iff there exist a constant $C > 0$ and a non-negative integer $m$ such that

$$|f(z)| \le C(1 + |z|)^m e^{A|\Im z|} \qquad (z \in \mathbb{C}^n).$$

(ii) The entire function $f$ is the Fourier-Laplace transform of a function $u \in \mathcal{D}(S_A)$ iff for each $m$, there exists a positive constant $C = C(m)$ such that

$$|f(z)| \le C(m)(1 + |z|)^{-m} e^{A|\Im z|} \qquad (z \in \mathbb{C}^n).$$




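Proof. Let $r > 0$, and suppose $u \in \mathcal{E}'$ with $\operatorname{supp} u \subset S_A$. By Theorem II.2.8 with $\omega = \{x;\ |x| < A + r\}$, there exists $g \in L^\infty(\omega)$ such that $u|_\omega = D_1^s\cdots D_n^s g$. We may take $g$ and $s$ independent of $r$ for all $0 < r \le 1$. Extend $g$ to $\mathbb{R}^n$ by setting $g = 0$ for $|x| \ge A + 1$ (then of course $\|g\|_{L^1} < \infty$). Since $\operatorname{supp} u \subset \omega$, we have $u = D_1^s\cdots D_n^s g$, and therefore,

$$|\hat{u}(z)| = \Big|\int e^{-ix\cdot z}\,D_1^s\cdots D_n^s g(x)\,dx\Big| = \Big|\int (D_1^s\cdots D_n^s e^{-ix\cdot z})\,g(x)\,dx\Big| = \Big|\int_\omega (-1)^{ns} z_1^s\cdots z_n^s\,e^{-ix\cdot z}g(x)\,dx\Big|$$
$$\le (|z_1|\cdots|z_n|)^s\,\|g\|_{L^1}\,\sup_\omega e^{x\cdot\Im z} \le C(1 + |z|)^m e^{(A+r)|\Im z|}$$

for any constant $C \ge \|g\|_{L^1}$, $m = sn$, and $r \le 1$ (since $|x\cdot\Im z| \le |x|\,|\Im z| \le (A+r)|\Im z|$ on $\omega$, by Schwarz's inequality). Letting $r \to 0$, we obtain the necessity of the estimate in (i).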

If $u \in \mathcal{D}$ with support in $S_A$, we have for any multi-index $\beta$ and $z \in \mathbb{C}^n$

$$|z^\beta\hat{u}(z)| = |(FD^\beta u)(z)| = \Big|\int_{S_A} e^{-ix\cdot z}(D^\beta u)(x)\,dx\Big| \le \|D^\beta u\|_{L^1}\,e^{A|\Im z|},$$

and the necessity of the estimates in (ii) follows.

Suppose next that the entire function $f$ satisfies the estimates in (ii). In particular, its restriction to $\mathbb{R}^n$ is in $L^1$, and we may define $u = (2\pi)^{-n/2}J\mathcal{F}f|_{\mathbb{R}^n}$ (where $\mathcal{F} = (2\pi)^{-n/2}F$). The estimates in (ii) show that $y^\alpha f(y) \in L^1(\mathbb{R}^n)$ for all multi-indices $\alpha$, and therefore $D^\alpha$ may be applied to the integral defining $u$ under the integration sign; in particular, $u \in \mathcal{E}$.

The estimates in (ii) show also that the integral defining $u$ can be shifted (by Cauchy's integral theorem) to $\mathbb{R}^n + it$, with $t \in \mathbb{R}^n$ fixed (but arbitrary). Therefore

$$|u(x)| \le C(m)\exp[A|t| - x\cdot t]\int (1 + |y|)^{-m}\,dy$$

for all $x \in \mathbb{R}^n$ and $m \in \mathbb{N}$. Fix $m$ so that the last integral converges, and choose $t = \lambda x$ $(\lambda > 0)$. Then for a suitable constant $C'$ and $|x| > A$, $|u(x)| \le C'\exp[-\lambda|x|(|x| - A)] \to 0$ as $\lambda \to \infty$. This shows that $\operatorname{supp} u \subset S_A$, and so $u \in \mathcal{D}(S_A)$. But then its Fourier-Laplace transform is entire, and coincides with the entire function $f$ on $\mathbb{R}^n$ (hence on $\mathbb{C}^n$), by the Fourier inversion formula.

Finally, suppose the estimate in (i) is satisfied. Then $f|_{\mathbb{R}^n} \in \mathcal{S}'$, and therefore $f|_{\mathbb{R}^n} = \hat{u}$ for a unique $u \in \mathcal{S}'$. It remains to show that $\operatorname{supp} u \subset S_A$. Let $\phi$ be as in II.1.1, and let $u_r := u*\phi_r$ be the corresponding regularization of $u$. By Theorem II.3.5, $\hat{u}_r = \hat{u}\hat\phi_r$. Since $\phi_r \in \mathcal{D}(S_r)$, it follows from the necessity part of (ii) that $|\hat\phi_r(z)| \le C(m)(1 + |z|)^{-m}e^{r|\Im z|}$ for all $m$. Therefore, by the estimate in (i),

$$|f(z)\hat\phi_r(z)| \le C'(k)(1 + |z|)^{-k}e^{(A+r)|\Im z|}$$

for all integers $k$. By the sufficiency part of (ii), the entire function $f(z)\hat\phi_r(z)$ is the Fourier-Laplace transform of some $\psi_r \in \mathcal{D}(S_{A+r})$. Since $F$ is injective on $\mathcal{S}'$, we conclude (by restricting to $\mathbb{R}^n$) that the distribution $u_r$ "is the function" $\psi_r$. In particular, $\operatorname{supp} u_r \subset S_{A+r}$ for all $r > 0$. If $\chi \in \mathcal{D}$ has support in (the open set) $S_A^c$, there exists $r_0 > 0$ such that $\operatorname{supp}\chi \subset S_{A+r_0}^c$; then for all $0 < r \le r_0$, $u_r(\chi) = 0$ because the supports of $u_r$ and $\chi$ are contained in the disjoint sets $S_{A+r}$ and $S_{A+r_0}^c$ (respectively). Letting $r \to 0$, it follows that $u(\chi) = 0$, and we conclude that $u$ has support in $S_A$.
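A simple one-dimensional instance of part (i) may help fix ideas. Take $u$ to be the indicator function of $[-A, A]$, viewed as a distribution with support in $S_A$; its Fourier-Laplace transform is $\hat{u}(z) = \int_{-A}^{A} e^{-ixz}\,dx = 2\sin(Az)/z$, and the estimate in (i) holds with $m = 0$ and $C = 2A$. The Python sketch below checks this bound on a grid of complex points; the value $A = 1.5$ and the grid are hypothetical choices for the example.

```python
import cmath
import math

A = 1.5  # half-width of the interval (our choice)

def u_hat(z):
    """Fourier-Laplace transform of the indicator of [-A, A]: 2 sin(A z) / z."""
    if z == 0:
        return complex(2.0 * A)
    return 2.0 * cmath.sin(A * z) / z

def pw_bound(z):
    """Right-hand side of estimate (i) with C = 2A and m = 0."""
    return 2.0 * A * math.exp(A * abs(z.imag))

def bound_holds_on_grid(step=0.5, nx=20, ny=8):
    """Checks |u_hat(z)| <= 2A exp(A |Im z|) at the points z = step * (a + i b)."""
    for a in range(-nx, nx + 1):
        for b in range(-ny, ny + 1):
            z = complex(step * a, step * b)
            if abs(u_hat(z)) > pw_bound(z) * (1.0 + 1e-12):
                return False
    return True
```

The exponential factor $e^{A|\Im z|}$ cannot be improved here, since $|\hat{u}(it)|$ grows like $e^{A|t|}/|t|$ as $|t| \to \infty$.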


Example. Consider the orthonormal sequences $\{f_k;\ k \in \mathbb{Z}\}$ and $\{g_k;\ k \in \mathbb{Z}\}$ in $L^2(\mathbb{R})$ defined in the example at the end of II.3.2. We saw that $f \in L^2(\mathbb{R})$ belongs to the closure of the span of $\{f_k\}$ in $L^2(\mathbb{R})$ iff it vanishes in $(-\pi,\pi)^c$. Since $\mathcal{F}$ (the $L^2$-Fourier transform) is a Hilbert isomorphism of $L^2(\mathbb{R})$ onto itself, $g := \mathcal{F}f \in L^2(\mathbb{R})$ belongs to the closure of the span of $\{g_k\}$ iff it extends to $\mathbb{C}$ as an entire function of exponential type $\le \pi$ (by Theorem II.3.6). The expansion we found for $g$ in that example extends to $\mathbb{C}$:

$$g(z) = (1/\pi)\sin \pi z\,\sum_k (-1)^k g(k)/(z - k) \qquad (z \in \mathbb{C}), \tag{24}$$

where the series converges uniformly in $|\Im z| \le r$, for each $r$. We proved that any entire function $g$ of exponential type $\le \pi$, whose restriction to $\mathbb{R}$ belongs to $L^2(\mathbb{R})$, admits the expansion (24).


Suppose $h$ is an entire function of exponential type $\le \pi$, whose restriction to
$\mathbb{R}$ is bounded. Let $g(z) := [h(z) - h(0)]/z$ for $z \ne 0$ and $g(0) = h'(0)$. Then $g$ is
entire of exponential type $\le \pi$, and $g|_{\mathbb{R}} \in L^2(\mathbb{R})$. Applying (24) to $g$, we obtain

$$h(z) = h(0) + (1/\pi)\sin \pi z \Big( h'(0) + \sum_{k \ne 0} (-1)^k [h(k) - h(0)][1/k + 1/(z-k)] \Big). \tag{25}$$

The series $s(z)$ in (25) can be differentiated term by term (because the
series thus obtained converges uniformly in any strip $|\Re z| \le r$). We then
obtain

$$h'(z) = \cos \pi z\,(h'(0) + s(z)) + (1/\pi)\sin \pi z \sum_{k \ne 0} (-1)^{k-1}[h(k) - h(0)]/(z-k)^2.$$

In particular,

$$h'(1/2) = (4/\pi) \sum_{k \in \mathbb{Z}} (-1)^{k-1} \frac{h(k) - h(0)}{(2k-1)^2}.$$

Since $\sum_{k \in \mathbb{Z}} (-1)^{k-1}/(2k-1)^2 = 0$, we can rewrite the last formula in the form

$$h'(1/2) = (4/\pi) \sum_{k \in \mathbb{Z}} (-1)^{k-1} \frac{h(k)}{(2k-1)^2}. \tag{26}$$




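For $t \in \mathbb{R}$ fixed, the function $h_t(z) := h(z + t - 1/2)$ is entire of exponential type
$\le \pi$, $\sup_{\mathbb{R}} |h_t| = \sup_{\mathbb{R}} |h| := M < \infty$, and $h_t'(1/2) = h'(t)$. Therefore, by (26)
applied to $h_t$,

$$h'(t) = (4/\pi) \sum_{k \in \mathbb{Z}} (-1)^{k-1} \frac{h(t + k - 1/2)}{(2k-1)^2}. \tag{27}$$

Hence (cf. example at the end of Terminology 8.11)

$$|h'(t)| \le (4/\pi) M \sum_{k \in \mathbb{Z}} \frac{1}{(2k-1)^2} = (4/\pi) M (\pi^2/4) = \pi M. \tag{28}$$

Thus, considering $h$ restricted to $\mathbb{R}$,

$$\|h'\|_\infty \le \pi \|h\|_\infty. \tag{29}$$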


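Let $1 < p < \infty$, and let $q$ be its conjugate exponent. For any simple measurable
function $\phi$ on $\mathbb{R}$ with $\|\phi\|_q = 1$, we have by (27) and Hölder's inequality

$$\Big|\int h'\phi\,dt\Big| = \frac{4}{\pi}\Big|\sum_{k \in \mathbb{Z}} \frac{(-1)^{k-1}}{(2k-1)^2}\int h(t + k - 1/2)\phi(t)\,dt\Big| \le \frac{4}{\pi}\sum_{k \in \mathbb{Z}} \frac{1}{(2k-1)^2}\|h\|_p\|\phi\|_q = \pi\|h\|_p.$$

Taking the supremum over all such functions $\phi$, it follows that

$$\|h'\|_p \le \pi\|h\|_p. \tag{30}$$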


(For $p = 1$, (30) follows directly from (27): for any real numbers $a < b$,

$$\int_a^b |h'(t)|\,dt \le \frac{4}{\pi}\sum_{k \in \mathbb{Z}} \frac{1}{(2k-1)^2}\|h\|_1 = \pi\|h\|_1,$$

and (30) for $p = 1$ is obtained by letting $a \to -\infty$ and $b \to \infty$.)

If $f$ is an entire function of exponential type $\le \nu$ (with $\nu > 0$) and is bounded on $\mathbb{R}$,
the function $h(z) := f(\pi z/\nu)$ is entire of exponential type $\le \pi$; $\|h\|_\infty = \|f\|_\infty$
(norms in $L^\infty(\mathbb{R})$); and $h'(t) = (\pi/\nu) f'(\pi t/\nu)$. A simple calculation starting
from (30) for $h$ shows that

$$\|f'\|_p \le \nu\|f\|_p \tag{31}$$

for all $p \in [1, \infty]$. This is Bernstein's inequality (for $f$ entire of exponential type
$\le \nu$, that is bounded on $\mathbb{R}$).
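
The sampling formula (27) and the bound (29) can be checked numerically. The sketch below is only an illustration: the function name `derivative_via_samples`, the truncation length `n_terms`, and the test function `h(t) = cos(2t)` (entire of exponential type 2, hence of type at most pi, and bounded on the real line) are choices made here, not part of the text.

```python
import math

def derivative_via_samples(h, t, n_terms=20000):
    """Truncation of (27): (4/pi) * sum_k (-1)^(k-1) h(t + k - 1/2) / (2k - 1)^2."""
    total = 0.0
    for k in range(-n_terms, n_terms + 1):
        sign = 1.0 if (k - 1) % 2 == 0 else -1.0   # (-1)^(k-1), also for negative k
        total += sign * h(t + k - 0.5) / (2 * k - 1) ** 2
    return 4.0 / math.pi * total

def h(t):
    # Illustrative choice: exponential type 2 <= pi, sup norm 1 on the real line.
    return math.cos(2.0 * t)

t = 0.3
approx = derivative_via_samples(h, t)
exact = -2.0 * math.sin(2.0 * t)
print(approx, exact)          # the two values should agree up to the truncation error
print(abs(exact) <= math.pi)  # consistent with (29): |h'(t)| <= pi * sup|h|
```

The truncated tail of the series is bounded by a multiple of `1/n_terms`, so the agreement improves only slowly as more terms are taken.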


II.3.1 The spaces $W_{p,k}$


II.3.7. Temperate weights. 
A (temperate) weight on $\mathbb{R}^n$ is a positive function $k$ on $\mathbb{R}^n$ such that

$$\frac{k(x+y)}{k(y)} \le (1+C|x|)^m \qquad (x,y \in \mathbb{R}^n) \tag{32}$$

for some constants $C, m > 0$.
By (32),

$$(1+C|x|)^{-m} \le \frac{k(x+y)}{k(y)} \le (1+C|x|)^m \qquad (x,y \in \mathbb{R}^n), \tag{33}$$

and it follows that $k$ is continuous, satisfies the estimate

$$k(0)(1+C|x|)^{-m} \le k(x) \le k(0)(1+C|x|)^m \qquad (x \in \mathbb{R}^n), \tag{34}$$

and $1/k$ is also a weight.


For any real $s$, $k^s$ is a weight (trivial for $s > 0$, and since $1/k$ is a weight, the
conclusion follows for $s < 0$ as well).

An elementary calculation shows that $1+|x|^2$ is a weight; therefore, $k_s(x) :=
(1+|x|^2)^{s/2}$ is a weight for any real $s$.
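
As a quick sanity check of (32) for the weights $k_s$, one can sample random points and compare both sides. The constants $C = 1$ and $m = |s|$ used below are not stated in the text; they are an assumption made here, justified by the fact that $y \mapsto (1+|y|^2)^{1/2}$ is 1-Lipschitz. The helper names are hypothetical.

```python
import math
import random

def k_s(x, s):
    """k_s(x) = (1 + |x|^2)^(s/2) for a point x given as a tuple of coordinates."""
    return (1.0 + sum(c * c for c in x)) ** (s / 2.0)

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def worst_ratio(s, dim=3, trials=10000, seed=0):
    """Largest sampled value of k_s(x+y) / (k_s(y) * (1+|x|)^|s|); (32) predicts <= 1."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = tuple(rng.uniform(-50, 50) for _ in range(dim))
        y = tuple(rng.uniform(-50, 50) for _ in range(dim))
        x_plus_y = tuple(a + b for a, b in zip(x, y))
        bound = k_s(y, s) * (1.0 + norm(x)) ** abs(s)
        worst = max(worst, k_s(x_plus_y, s) / bound)
    return worst

print(worst_ratio(2.5), worst_ratio(-1.0))   # both values should not exceed 1
```

Random sampling does not prove the inequality, of course; it only illustrates that the same constants work for positive and negative $s$.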

Sums and products of weights are weights. For any weight $k$, set

$$\bar{k}(x) := \sup_y \frac{k(x+y)}{k(y)}, \tag{35}$$

so that, by definition,

$$\bar{k}(x) \le (1+C|x|)^m \quad\text{and}\quad k(x+y) \le \bar{k}(x)k(y). \tag{36}$$

Also

$$\bar{k}(x+y) \le \bar{k}(x)\bar{k}(y). \tag{37}$$

By (36) and (37), $\bar{k}$ is a weight with the additional "normal" properties (37) and

$$1 = \bar{k}(0) \le \bar{k}(x). \tag{38}$$

($\bar{k}(0) = 1$ by (35); then by (37) and (36), for all $r = 1,2,\ldots$,

$$1 = \bar{k}(0) = \bar{k}(rx - rx) \le \bar{k}(x)^r\,\bar{k}(-rx) \le \bar{k}(x)^r(1 + Cr|x|)^m;$$

taking the $r$th root and letting $r \to \infty$, we get $1 \le \bar{k}(x)$.)
Given a weight $k$ and $t > 0$, define

$$k^t(x) = \sup_y k(x-y)\exp(-t|y|) = \sup_y \exp(-t|x-y|)k(y).$$

($x, y$ range in $\mathbb{R}^n$.)
We have

$$k(x) \le k^t(x) \le \sup_y (1+C|y|)^m k(x)\exp(-t|y|) \le C_t k(x),$$

that is,

$$1 \le k^t/k \le C_t,$$

where $C_t$ is a constant depending on $t$.
Also

$$k^t(x+x') = \sup_y k(x+x'-y)\exp(-t|y|) \le \sup_y (1+C|x'|)^m k(x-y)\exp(-t|y|) = (1+C|x'|)^m k^t(x),$$

that is, $k^t$ is a weight (with the constants $C, m$ of $k$, whence independent of $t$).
By the last inequality,

$$(1 \le)\ \overline{k^t}(x) \le (1+C|x|)^m.$$

Since

$$k^t(x+x') = \sup_y \exp(-t|x+x'-y|)k(y) \le e^{t|x'|}\sup_y \exp(-t|x-y|)k(y) = e^{t|x'|}k^t(x),$$

therefore,

$$(1 \le)\ \overline{k^t}(x) \le e^{t|x|}.$$

In particular, $\overline{k^t} \to 1$ as $t \to 0+$, uniformly on compact subsets of $\mathbb{R}^n$.
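
The two estimates for $k^t$ above can be illustrated in one dimension by replacing the supremum over $y$ with a maximum over a finite grid. Everything in this sketch is an assumption made for illustration: the weight $k = k_1$, the grid `GRID`, and the function names. The inequality $k^t(x+x') \le e^{t|x'|}k^t(x)$ holds exactly for the grid maximum, by the same triangle-inequality argument as in the text.

```python
import math

def k(y):
    # Illustrative weight: k_1(y) = (1 + y^2)^(1/2) in one dimension.
    return math.sqrt(1.0 + y * y)

# The supremum over y is replaced by a maximum over this finite grid (an approximation).
GRID = [0.05 * j for j in range(-400, 401)]

def k_t(x, t):
    """Grid approximation of k^t(x) = sup_y exp(-t|x - y|) k(y)."""
    return max(math.exp(-t * abs(x - y)) * k(y) for y in GRID)

t = 0.5
for x in (-3.0, 0.0, 1.5):
    # k <= k^t (take y = x), and k^t(x + dx) <= exp(t|dx|) k^t(x).
    print(x, k(x), k_t(x, t), k_t(x + 0.7, t) / k_t(x, t), math.exp(t * 0.7))
```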

A weight associated with the differential operator $P(D)$ (for any polynomial
$P$ on $\mathbb{R}^n$) is defined by

$$k_P := \Big(\sum_\alpha |P^{(\alpha)}|^2\Big)^{1/2}, \tag{39}$$

where the (finite) sum extends over all multi-indices $\alpha$. The estimate (32) follows
from Taylor's formula and Schwarz's inequality:

$$k_P(x+y)^2 = \sum_\alpha \Big|\sum_{|\beta| \le m} P^{(\alpha+\beta)}(y)\,x^\beta/\beta!\Big|^2 \le \sum_\alpha \sum_{|\beta| \le m} |P^{(\alpha+\beta)}(y)|^2 \sum_{|\beta| \le m} |x^\beta/\beta!|^2 \le k_P(y)^2(1+C|x|)^{2m},$$

where $m = \deg P$.
Extending the sum in (39) over multi-indices $\alpha \ne 0$ only, we get a weight $k'_P$
(same verification!), that will also play a role in the sequel.


II.3.8. Weighted $L^p$-spaces.
For any (temperate) weight $k$ and $p \in [1, \infty]$, consider the normed space

$$L_{p,k} := (1/k)L^p = \{f;\ kf \in L^p\}$$

(where $L^p := L^p(\mathbb{R}^n)$), with the natural norm $\|f\|_{L_{p,k}} = \|kf\|_p$ ($\|\cdot\|_p$ denotes
the $L^p$-norm). One verifies easily that $L_{p,k}$ is a Banach space for all $p \in [1, \infty]$,
and $(L_{p,k})^*$ is isomorphic and isometric to $L_{q,1/k}$ for $1 \le p < \infty$ ($q$ denotes the



conjugate exponent of $p$): if $\Lambda \in (L_{p,k})^*$, there exists a unique $g \in L_{q,1/k}$ such
that

$$\Lambda f = \int fg\,dx \qquad (f \in L_{p,k})$$

and

$$\|\Lambda\| = \|g\|_{L_{q,1/k}}.$$

By (34), $\mathcal{S} \subset L_{p,k}$ topologically.
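Given $f \in L_{p,k}$, Hölder's inequality shows that

$$\Big|\int f\phi\,dx\Big| \le \|f\|_{L_{p,k}}\|\phi\|_{L_{q,1/k}} \qquad (\phi \in \mathcal{S}). \tag{40}$$

Since $\mathcal{S} \subset L_{q,1/k}$ topologically, it follows from (40) that the map $\phi \to \int f\phi\,dx$
is continuous on $\mathcal{S}$, and belongs therefore to $\mathcal{S}'$. With the usual identification,
this means that $L_{p,k} \subset \mathcal{S}'$, and it follows also from (40) that the inclusion is
topological (if $f_j \to 0$ in $L_{p,k}$, then $\int \phi f_j\,dx \to 0$ by (40), for all $\phi \in \mathcal{S}$, that is,
$f_j \to 0$ in $\mathcal{S}'$). We showed therefore that

$$\mathcal{S} \subset L_{p,k} \subset \mathcal{S}' \tag{41}$$

topologically.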
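
A discretized version of (40) can be checked directly, since Hölder's inequality holds exactly for Riemann sums on a common grid. The grid, the weight $k = k_{3/2}$, the exponents, and the sample functions below are all illustrative assumptions made here; the Gaussian merely stands in for an element of $\mathcal{S}$.

```python
import math

def weighted_norm(values, weights, p, dx):
    """Riemann-sum version of ||k f||_p on a uniform grid with spacing dx."""
    return (sum(abs(w * v) ** p for v, w in zip(values, weights)) * dx) ** (1.0 / p)

dx = 0.01
xs = [dx * j for j in range(-1000, 1001)]
k = [(1.0 + x * x) ** 0.75 for x in xs]                # k = k_s with s = 1.5 (illustrative)
inv_k = [1.0 / w for w in k]
f = [math.cos(3.0 * x) / (1.0 + x * x) for x in xs]    # sample f with k f in L^p
phi = [math.exp(-x * x) for x in xs]                   # Gaussian standing in for phi in S

p, q = 3.0, 1.5                                        # conjugate exponents: 1/p + 1/q = 1
pairing = abs(sum(a * b for a, b in zip(f, phi)) * dx)
bound = weighted_norm(f, k, p, dx) * weighted_norm(phi, inv_k, q, dx)
print(pairing, bound)                                  # pairing should not exceed bound
```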


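II.3.9. The spaces $W_{p,k}$.
Let

$$\mathcal{F}: u \in \mathcal{S}' \to \mathcal{F}u := (2\pi)^{-n/2}\hat{u} \in \mathcal{S}',$$

and consider the normed space (for $p, k$ given as before)

$$W_{p,k} := \mathcal{F}^{-1}L_{p,k} = \{u \in \mathcal{S}';\ \mathcal{F}u \in L_{p,k}\} \tag{42}$$

with the norm

$$\|u\|_{p,k} := \|\mathcal{F}u\|_{L_{p,k}} = \|k\mathcal{F}u\|_p \qquad (u \in W_{p,k}). \tag{43}$$

Note that for any $t > 0$,

$$W_{p,k} = W_{p,k^t}$$

(because of the inequality $1 \le k^t/k \le C_t$, cf. II.3.7).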

By definition, $\mathcal{F}: W_{p,k} \to L_{p,k}$ is a (linear) surjective isometry, and therefore
$W_{p,k}$ is a Banach space. Since $\mathcal{F}^{-1}$ is a continuous automorphism of both $\mathcal{S}$ and
$\mathcal{S}'$, it follows from (41) in II.3.8 (and the said isometry) that

$$\mathcal{S} \subset W_{p,k} \subset \mathcal{S}' \tag{44}$$

topologically.

Fix $\phi$ as in II.1.1, and consider the regularizations $u_r = u * \phi_r \in \mathcal{E}$ of
$u \in W_{p,k}$, $p < \infty$. As $r \to 0+$, $k(x)\hat{u}_r(x) = k(x)\hat{u}(x)\hat{\phi}(rx) \to k(x)\hat{u}(x)$ pointwise,
and $|k\hat{u}_r| \le |k\hat{u}| \in L^p$; therefore, $k\hat{u}_r \to k\hat{u}$ in $L^p$, that is, $u_r \to u$ in $W_{p,k}$. One
verifies easily that $u_r \in \mathcal{S}$, and consequently, $\mathcal{S}$ is dense in $W_{p,k}$. Since $\mathcal{D}$ is dense
in $\mathcal{S}$, and $\mathcal{S}$ is topologically included in $W_{p,k}$, we conclude that $\mathcal{D}$ is dense in



$W_{p,k}$ (and so $W_{p,k}$ is the completion of $\mathcal{D}$ with respect to the norm $\|\cdot\|_{p,k}$) for
any $1 \le p < \infty$ and any weight $k$. The special space $W_{2,k_s}$ is called Sobolev's
space, and is usually denoted $H^s$.

Let $L \in W_{p,k}^*$ (for some $1 \le p < \infty$). Since $\mathcal{F}$ is a linear isometry of $W_{p,k}$ onto
$L_{p,k}$, the map $\Lambda = L\mathcal{F}^{-1}$ is a continuous linear functional on $L_{p,k}$ with norm
$\|L\|$. By the preceding characterization of $L_{p,k}^*$, there exists a unique $g \in L_{q,1/k}$
such that $\|g\|_{L_{q,1/k}} = \|L\|$ and $\Lambda f = \int fg\,dx$ for all $f \in L_{p,k}$. Define $v = \mathcal{F}^{-1}g$.
For all $u \in W_{p,k}$, denoting $f = \mathcal{F}u\ (\in L_{p,k})$, we have $\|v\|_{q,1/k} = \|L\|$ and

$$Lu = L\mathcal{F}^{-1}f = \Lambda f = \int (\mathcal{F}u)(\mathcal{F}v)\,dx. \tag{45}$$

The continuous functional $L$ is uniquely determined by its restriction to the
dense subspace $\mathcal{S}$ of $W_{p,k}$. For $u \in \mathcal{S}$, we may write (45) in the form

$$Lu = v(\mathcal{F}^2u) = v(Ju) = (Jv)(u) \qquad (u \in \mathcal{S}), \tag{46}$$

that is, $L|_{\mathcal{S}} = Jv$. Conversely, any $v \in W_{q,1/k}$ determines through (45) an
element $L \in W_{p,k}^*$ such that $\|L\| = \|v\|_{q,1/k}$ (and $L|_{\mathcal{S}} = Jv$). We conclude that
$W_{p,k}^*$ is isometrically isomorphic with $W_{q,1/k}$. In particular, $(H^s)^*$ is isometrically
isomorphic with $H^{-s}$.

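If the distribution $u \in W_{p,k}$ has compact support, and $v \in W_{\infty,k'}$, then by
Theorem II.3.5, $u * v \in \mathcal{S}'$ and

$$\|u * v\|_{p,kk'} = \|kk'\mathcal{F}(u * v)\|_p = (2\pi)^{n/2}\|(k\mathcal{F}u)(k'\mathcal{F}v)\|_p \le (2\pi)^{n/2}\|k\mathcal{F}u\|_p\|k'\mathcal{F}v\|_\infty = (2\pi)^{n/2}\|u\|_{p,k}\|v\|_{\infty,k'}.$$

In particular,

$$(W_{p,k} \cap \mathcal{E}') * W_{\infty,k'} \subset W_{p,kk'}. \tag{47}$$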


Let $P$ be any polynomial on $\mathbb{R}^n$. We have $P(D)\delta \in \mathcal{E}' \subset \mathcal{S}'$, and

$$\mathcal{F}P(D)\delta = P(M)\mathcal{F}\delta = (2\pi)^{-n/2}P(M)1 = (2\pi)^{-n/2}P.$$

Thus

$$\|P(D)\delta\|_{\infty,k'} = (2\pi)^{-n/2}\|k'P\|_\infty < \infty,$$

for any weight $k'$ such that $k'P$ is bounded (we may take for example $k' = 1/k_P$,
or $k' = k_s$ with $s \le -m$, where $m = \deg P$).

Thus, $P(D)\delta \in W_{\infty,k'}$ (for such $k'$), and for any $u \in W_{p,k}$, $P(D)u =
(P(D)\delta) * u \in W_{p,kk'}$.

Formally stated, for any weight $k'$ such that $k'P$ is bounded,

$$P(D)W_{p,k} \subset W_{p,kk'}. \tag{48}$$

The shown calculations also demonstrate that

$$\|P(D)u\|_{p,kk'} \le \|k'P\|_\infty\|u\|_{p,k} \qquad (u \in W_{p,k}), \tag{49}$$




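that is, $P(D)$ is a continuous (linear) map of $W_{p,k}$ into $W_{p,kk'}$.

If $u \in W_{p,k}$ and $\phi \in \mathcal{D}$, then $\hat{u} \in L_{p,k}$ and $\hat{\phi} \in \mathcal{S} \subset L_{q,1/k}$, so that the
convolution $\hat{\phi} * \hat{u}$ makes sense as a usual integral. On the other hand, $\phi u$ is
well defined and belongs to $\mathcal{S}'$. Using Theorem II.3.5 and the Fourier inversion
formula on $\mathcal{S}'$, we see that

$$(2\pi)^{n/2}\mathcal{F}(\phi u) = (\mathcal{F}\phi) * (\mathcal{F}u).$$

Hence (since $k(x) \le \bar{k}(x-y)k(y)$),

$$(2\pi)^{n/2}\|\phi u\|_{p,k} = \|k[(\mathcal{F}\phi) * (\mathcal{F}u)]\|_p \le \|(\bar{k}|\mathcal{F}\phi|) * (k|\mathcal{F}u|)\|_p \le \|\bar{k}\mathcal{F}\phi\|_1\|k\mathcal{F}u\|_p,$$

that is,

$$\|\phi u\|_{p,k} \le (2\pi)^{-n/2}\|\phi\|_{1,\bar{k}}\|u\|_{p,k}. \tag{50}$$

Since $\mathcal{D}$ is dense in $\mathcal{S}$ and $\mathcal{S} \subset W_{1,\bar{k}}$, it follows from (50) that the multiplication
operator $\phi \in \mathcal{D} \to \phi u \in W_{p,k}$ extends uniquely as an operator from $\mathcal{S}$ to $W_{p,k}$
(same notation!), that is, $\mathcal{S}W_{p,k} \subset W_{p,k}$, and (50) is valid for all $\phi \in \mathcal{S}$ and
$u \in W_{p,k}$.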

Apply (50) to the weights $k^t$ associated with $k$ (cf. II.3.7). We have (for any
$\phi \in \mathcal{S}$)

$$\|\phi\|_{1,\overline{k^t}} = \int \overline{k^t}\,|\mathcal{F}\phi|\,dx.$$

As $t \to 0+$, the integrands converge pointwise to $|\mathcal{F}\phi|$, and are dominated by
$(1+C|x|)^m|\mathcal{F}\phi| \in L^1$ (with $C, m$ independent of $t$). By Lebesgue's dominated
convergence theorem, the integral tends to $\|\mathcal{F}\phi\|_1 := \|\phi\|_{1,1}$. There exists
therefore $t_0 > 0$ (depending on $\phi$) such that $\|\phi\|_{1,\overline{k^t}} \le 2\|\phi\|_{1,1}$ for all $t < t_0$.
Hence by (50)

$$\|\phi u\|_{p,k^t} \le 2(2\pi)^{-n/2}\|\phi\|_{1,1}\|u\|_{p,k^t} \tag{51}$$

for all $0 < t < t_0$, $\phi \in \mathcal{S}$, and $u \in W_{p,k} = W_{p,k^t}$, with $t_0$ depending on $\phi$. This
inequality is used in the proof of Theorem II.7.2.

Let $j$ be a non-negative integer. If $|y|^j \in L_{q,1/k}$ for some weight $k$ and some
$1 \le q \le \infty$, then $y^\alpha \in L_{q,1/k}$ for all multi-indices $\alpha$ with $|\alpha| \le j$. Consequently,
for any $u \in W_{p,k}$ (with $p$ conjugate to $q$), $y^\alpha\hat{u}(y) \in L^1$. By the Fourier inversion
formula, $u(x) = (2\pi)^{-n}\int e^{ix\cdot y}\hat{u}(y)\,dy$, and the integrals obtained by formal
differentiations under the integral sign up to the order $j$ converge absolutely and
uniformly, and are equal therefore to the classical derivatives of $u$. In particular,
$u \in C^j$. This shows that

$$W_{p,k} \subset C^j \tag{52}$$

if $|y|^j \in L_{q,1/k}$. This is a regularity property of the distributions in $W_{p,k}$.




II.4 Fundamental solutions


II.4.1. Let $P$ be a polynomial on $\mathbb{R}^n$. A fundamental solution for the (partial)
differential operator $P(D)$ is a distribution $v$ on $\mathbb{R}^n$ such that

$$P(D)v = \delta. \tag{1}$$

For any $f \in \mathcal{E}'$, the (well-defined) distribution $u := v * f$ is then a solution of
the (partial) differential equation

$$P(D)u = f. \tag{2}$$

(Indeed, $P(D)u = (P(D)v) * f = \delta * f = f$.) The identity

$$P(D)(v * u) = v * (P(D)u) = u \qquad (u \in \mathcal{E}') \tag{3}$$

means that the map $V: u \in \mathcal{E}' \to v * u$ is the inverse of the map $P(D): \mathcal{E}' \to \mathcal{E}'$.
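
A one-dimensional illustration of (1) and (2), under assumptions that are not in the text: we take $D = -i\,d/dx$ and $P(y) = 1 + y^2$, so that $P(D) = 1 - d^2/dx^2$, and use the classical fact that $v(x) = e^{-|x|}/2$ is a fundamental solution of this operator. The sketch convolves $v$ with a compactly supported bump $f$ by a Riemann sum and checks $u - u'' \approx f$ with a centered second difference; all names and the step size are illustrative.

```python
import math

def bump(x):
    # Smooth function supported in [-1, 1], standing in for f with compact support.
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def fundamental_solution(x):
    # v(x) = exp(-|x|)/2 satisfies v - v'' = delta on the real line.
    return 0.5 * math.exp(-abs(x))

def convolve_at(x, h=0.01):
    """Riemann sum for (v * f)(x), integrating over the support [-1, 1] of f."""
    n = int(round(1.0 / h))
    return h * sum(fundamental_solution(x - j * h) * bump(j * h) for j in range(-n, n + 1))

def residual(x, h=0.01):
    """Value of (u - u'') - f at x, with u = v * f and u'' from a centered difference."""
    u_minus, u_mid, u_plus = (convolve_at(x + s * h, h) for s in (-1, 0, 1))
    second_derivative = (u_plus - 2.0 * u_mid + u_minus) / (h * h)
    return (u_mid - second_derivative) - bump(x)

print(residual(0.3), residual(2.0))   # both residuals should be close to zero
```

The residual is of the order of the squared step size, both inside the support of $f$ and outside it, where $u$ solves the homogeneous equation.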


Theorem II.4.2 (Ehrenpreis-Malgrange-Hörmander). Let $P$ be a
polynomial on $\mathbb{R}^n$, and $\epsilon > 0$. Then there exists a fundamental solution $v$ for
$P(D)$ such that

$$\operatorname{sech}(\epsilon|x|)v \in W_{\infty,k_P},$$

and $\|\operatorname{sech}(\epsilon|x|)v\|_{\infty,k_P}$ is bounded by a constant depending only on $\epsilon$, $n$, and
$m = \deg P$.

Note that $\operatorname{sech}(\epsilon|x|) \in \mathcal{E}$ (since $\cosh(\epsilon|x|) = \sum_{k=0}^\infty \epsilon^{2k}(x_1^2 + \cdots + x_n^2)^k/(2k)!$),
and therefore its product with the distribution $v$ is well defined.

For any $\psi \in \mathcal{D}$ (and $v$ as in the theorem), write

$$\psi v = [\psi\cosh(\epsilon|x|)][\operatorname{sech}(\epsilon|x|)v].$$

The function in the first square brackets belongs to $\mathcal{D}$; the distribution in
the second square brackets belongs to $W_{\infty,k_P}$. Hence $\psi v \in W_{\infty,k_P}$ by (50) in
Section II.3.9. Denoting

$$W^{loc}_{p,k} = \{u \in \mathcal{D}';\ \psi u \in W_{p,k} \text{ for all } \psi \in \mathcal{D}\},$$

the above observation means that the operator $P(D)$ has a fundamental solution
in $W^{loc}_{\infty,k_P}$.
The basic estimate needed for the proof of the theorem is stated in the
following.


Lemma II.4.3 (notation as in Theorem II.4.2). There exists a constant
$C > 0$ (depending only on $\epsilon$, $n$, and $m$) such that, for all $u \in \mathcal{D}$,

$$|u(0)| \le C\|\cosh(\epsilon|x|)P(D)u\|_{1,1/k_P}.$$

Proof of Theorem II.4.2. Assuming the lemma, we proceed with the proof
of the theorem. Consider the linear functional

$$w: P(D)u \to u(0) \qquad (u \in \mathcal{D}). \tag{4}$$




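By the lemma and the Hahn-Banach theorem, $w$ extends as a continuous linear
functional on $\mathcal{D}$ such that

$$|w(\phi)| \le C\|\cosh(\epsilon|x|)\phi\|_{1,1/k_P} \qquad (\phi \in \mathcal{D}). \tag{5}$$

Since $\mathcal{D} \subset W_{1,1/k_P}$ topologically, it follows that $w$ is continuous on $\mathcal{D}$, that is,
$w \in \mathcal{D}'$. By (5),

$$|[\operatorname{sech}(\epsilon|x|)w](\phi)| = |w(\operatorname{sech}(\epsilon|x|)\phi)| \le C\|\phi\|_{1,1/k_P}$$

for all $\phi \in \mathcal{D}$. Since $\mathcal{D}$ is dense in $W_{1,1/k_P}$, the distribution $\operatorname{sech}(\epsilon|x|)w$ extends
uniquely to a continuous linear functional on $W_{1,1/k_P}$ with norm $\le C$. Therefore

$$\operatorname{sech}(\epsilon|x|)w \in W_{\infty,k_P} \tag{6}$$

and

$$\|\operatorname{sech}(\epsilon|x|)w\|_{\infty,k_P} \le C. \tag{7}$$

Define $v = Jw := \check{w}$. Then the distribution $\operatorname{sech}(\epsilon|x|)v \in W_{\infty,k_P}$ has $\|\cdot\|_{\infty,k_P}$-norm
$\le C$ (the constant in the lemma), and for all $\phi \in \mathcal{D}$, we have by (4)

$$(P(D)v)(\phi) = [(P(D)v) * \check{\phi}](0) = (v * P(D)\check{\phi})(0) = w(P(D)\check{\phi}) = \check{\phi}(0) = \phi(0) = \delta(\phi),$$

that is, $P(D)v = \delta$.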
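Proof of Lemma II.4.3. (1) Let $p$ be a monic polynomial of degree $m$ in
one complex variable, say $p(z) = \sum_{j=0}^m a_jz^j$, $a_m = 1$. The polynomial $q(z) =
\sum_{j=0}^m \bar{a}_jz^{m-j}$ satisfies $q(0) = 1$ and

$$|q(e^{it})| = \Big|\sum_j \bar{a}_je^{-ijt}\Big| = |p(e^{it})|.$$

If $f$ is analytic on the closed unit disc, it follows from Cauchy's formula applied
to the function $fq$ that

$$|f(0)| = |f(0)q(0)| \le \frac{1}{2\pi}\int_0^{2\pi} |f(e^{it})q(e^{it})|\,dt = \frac{1}{2\pi}\int_0^{2\pi} |f(e^{it})p(e^{it})|\,dt. \tag{8}$$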
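
Inequality (8) is easy to test numerically. In the sketch below, the monic polynomial $p(z) = (z-2)(z+1/2)$, the function $f = \exp$, and the helper `circle_mean` are illustrative choices made here; the integral over the unit circle is approximated by an equispaced sum.

```python
import cmath
import math

def circle_mean(func, n=4096):
    """Approximate (1/2pi) * integral over [0, 2pi] of func(e^{it}) dt by an equispaced sum."""
    return sum(func(cmath.exp(2j * math.pi * j / n)) for j in range(n)) / n

def p(z):
    return (z - 2.0) * (z + 0.5)        # monic of degree 2: z^2 - 1.5 z - 1

def q(z):
    return -z * z - 1.5 * z + 1.0       # q(z) = sum of conj(a_j) z^(m-j) for this p

def f(z):
    return cmath.exp(z)                 # entire, hence analytic on the closed unit disc

z0 = cmath.exp(0.7j)
print(abs(q(z0)), abs(p(z0)))           # equal moduli on the unit circle
lhs = abs(f(0))
rhs = circle_mean(lambda z: abs(f(z) * p(z)))
print(lhs, rhs)                         # (8) predicts lhs <= rhs
```

Note that $p$ has a zero inside the unit disc, so the bound cannot be obtained by applying the mean value property to $fp$ directly; this is the reason for introducing $q$.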


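Writing $p(z) = \prod_{j=1}^m (z + z_j)$, we have for $k \le m$

$$p^{(k)}(z) = \sum_{n_1}\ \sum_{n_2 \notin \{n_1\}} \cdots \sum_{n_k \notin \{n_1,\ldots,n_{k-1}\}}\ \prod_{j \notin \{n_1,\ldots,n_k\}} (z + z_j), \tag{9}$$

where all indices range in $\{1,\ldots,m\}$.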




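Using (8) with the analytic function

$$f(z)\prod_{j \notin \{n_1,\ldots,n_k\}} (z + z_j)$$

and the (monic) polynomial

$$\prod_{j \in \{n_1,\ldots,n_k\}} (z + z_j),$$

we obtain

$$\Big|f(0)\prod_{j \notin \{n_1,\ldots,n_k\}} z_j\Big| \le \frac{1}{2\pi}\int_0^{2\pi} |f(e^{it})p(e^{it})|\,dt.$$

Since the number of summands in (9) is $m(m-1)\cdots(m-k+1) = m!/(m-k)!$,
it follows that

$$|f(0)p^{(k)}(0)| \le \frac{m!}{(m-k)!}\,\frac{1}{2\pi}\int_0^{2\pi} |f(e^{it})p(e^{it})|\,dt. \tag{10}$$

Since (10) remains valid when $p$ is replaced by $cp$ with $c \ne 0$ complex, the
inequality (10) is true for any polynomial $p$ and any function $f$ analytic on the
closed unit disc.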

(2) If $f$ is entire, we apply (10) to the function $f(rz)$ and the polynomial
$p(rz)$ for each $r > 0$. Then

$$|f(0)p^{(k)}(0)|\,r^k \le \frac{m!}{(m-k)!}\,\frac{1}{2\pi}\int_0^{2\pi} |f(re^{it})p(re^{it})|\,dt. \tag{11}$$

Let $g$ be a non-negative function with compact support, integrable with respect
to Lebesgue measure on $\mathbb{C}$, and depending only on $|z|$. We multiply (11) by
$2\pi r g(re^{it})$ and integrate with respect to $r$ over $[0,\infty)$. Thus,

$$|f(0)p^{(k)}(0)|\int |z|^kg(z)\,dz \le \frac{m!}{(m-k)!}\int |f(z)p(z)|g(z)\,dz, \tag{12}$$

where $dz = r\,dr\,dt$ is the area measure in $\mathbb{C}$.

(3) The $n$-dimensional version of (12) is obtained by applying (12) "one
variable at a time": let $f$ be an entire function on $\mathbb{C}^n$, $p$ a polynomial on $\mathbb{C}^n$,
and $g$ a non-negative function with compact support, integrable with respect to
Lebesgue measure $dz$ on $\mathbb{C}^n$, and depending only on $|z_1|,\ldots,|z_n|$. Then

$$|f(0)p^{(\alpha)}(0)|\int |z^\alpha|g(z)\,dz \le \frac{m!}{(m-|\alpha|)!}\int |f(z)p(z)|g(z)\,dz. \tag{13}$$

(4) Let $u \in \mathcal{D}$, fix $y \in \mathbb{R}^n$, and apply (13) to the entire function $f(z) =
\hat{u}(y + z)$, the polynomial $p(z) = P(y + z)$, and the function $g$ equal to the
indicator of the ball $B := \{z \in \mathbb{C}^n;\ |z| < \epsilon/2\}$. Then

$$|\hat{u}(y)P^{(\alpha)}(y)|\int_B |z^\alpha|\,dz \le \frac{m!}{(m-|\alpha|)!}\int_B |\hat{u}(y+z)P(y+z)|\,dz.$$




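Therefore,

$$k_P(y)|\hat{u}(y)| \le \sum_{|\alpha| \le m} |P^{(\alpha)}(y)\hat{u}(y)| \le C_1\int_B |\hat{u}(y+z)P(y+z)|\,dz = C_1(2\pi)^{n/2}\int_B |(\mathcal{F}P(D)u)(y+z)|\,dz = C_1(2\pi)^{n/2}\int_B |\mathcal{F}[e^{-ix\cdot z}P(D)u](y)|\,dz,$$

where $C_1$ is a constant depending only on $m$ and $n$. Denote $k := 1/k_P$. Then

$$(2\pi)^{n/2}|u(0)| = (2\pi)^{-n/2}\Big|\int \hat{u}(y)\,dy\Big| \le C_1\int\!\!\int_B k(y)|\mathcal{F}[e^{-ix\cdot z}P(D)u](y)|\,dz\,dy = C_1\int_B \|e^{-ix\cdot z}P(D)u\|_{1,k}\,dz \le C_1|B|\sup_{z \in B}\|e^{-ix\cdot z}P(D)u\|_{1,k},$$

where $|B|$ denotes the $\mathbb{C}^n$-Lebesgue measure of $B$ (depends only on $n$ and $\epsilon$).

For $z \in B$, all derivatives of the function $\phi_z(x) := e^{-ix\cdot z}/\cosh(\epsilon|x|)$ are
$O(e^{-\epsilon|x|/2})$, and therefore the family $\phi_B := \{\phi_z;\ z \in B\}$ is bounded in $\mathcal{S}$ (this
means that given any zero neighborhood $U$ in $\mathcal{S}$, there exists $\eta > 0$ such that
$\lambda\phi_B \subset U$ for all scalars $\lambda$ with modulus $< \eta$. For a topological vector space with
topology induced by a family of semi-norms, the above condition is equivalent
to the boundedness of the semi-norms of the family on the set $\phi_B$; thus, in the
special case of $\mathcal{S}$, the boundedness of $\phi_B$ means that $\sup_{z \in B}\|\phi_z\|_{\alpha,\beta} < \infty$). Since
$\mathcal{S} \subset W_{p,k}$ topologically (for any $p, k$), it follows that the set $\phi_B$ is bounded in
$W_{p,k}$ for any $p, k$. In particular, $M_B := \sup_{z \in B}\|\phi_z\|_{1,\bar{k}} < \infty$. ($M_B$ depends only
on $n$ and $\epsilon$.) By (50) in II.3.9,

$$|u(0)| \le C_1|B|\sup_{z \in B}(2\pi)^{-n/2}\|\phi_z[\cosh(\epsilon|x|)P(D)u]\|_{1,k} \le (2\pi)^{-n}C_1|B|\sup_{z \in B}\|\phi_z\|_{1,\bar{k}}\,\|\cosh(\epsilon|x|)P(D)u\|_{1,k} = C\|\cosh(\epsilon|x|)P(D)u\|_{1,k},$$

where $C = (2\pi)^{-n}C_1|B|M_B$ depends only on $n$, $m$ and $\epsilon$.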


II.5 Solution in $\mathcal{E}'$


II.5.1. Consider the operator $P(D)$ restricted to $\mathcal{E}'$. We look for necessary
and sufficient conditions on $f \in \mathcal{E}'$ such that the equation $P(D)u = f$ has a
solution $u \in \mathcal{E}'$. A necessary condition is immediate. For any solution $\phi \in \mathcal{E}$ of
the so-called homogeneous "adjoint" equation $P(-D)\phi = 0$, we have (for $u$ as
shown)

$$f(\phi) = (P(D)u)(\phi) = u(P(-D)\phi) = 0,$$

that is, $f$ annihilates the null space of $P(-D)|_{\mathcal{E}}$. In particular, $f$ annihilates
elements of the null space of the special form $\phi(x) = q(x)e^{-ix\cdot z}$, where $z \in \mathbb{C}^n$
and $q$ is a polynomial on $\mathbb{R}^n$ (let us call such elements "exponential solutions"
of the homogeneous adjoint equation). Theorem II.5.2 establishes that the latter
condition is also sufficient.


Theorem II.5.2. The following statements are equivalent for $f \in \mathcal{E}'$:

(1) The equation $P(D)u = f$ has a solution $u \in \mathcal{E}'$.

(2) $f$ annihilates every exponential solution of the homogeneous adjoint
equation.

(3) The function $F(z) := \hat{f}(z)/P(z)$ is entire on $\mathbb{C}^n$.


Proof. 1 $\Rightarrow$ 2. See II.5.1.
2 $\Rightarrow$ 3. In order to make one-complex-variable arguments, we consider the
function

$$F(t; z, w) := \frac{\hat{f}(tw + z)}{P(tw + z)} \qquad (t \in \mathbb{C};\ z, w \in \mathbb{C}^n). \tag{1}$$

Let $P_m$ be the principal part of $P$, and fix $w \in \mathbb{C}^n$ such that $P_m(w) \ne 0$. Since

$$P(tw + z) = P_m(tw + z) + \text{terms of lower degree} = \sum_\alpha P_m^{(\alpha)}(tw)z^\alpha/\alpha! + \text{terms of lower degree} = P_m(tw) + \text{terms of lower degree} = t^mP_m(w) + \text{terms of lower degree},$$

and $P_m(w) \ne 0$, $P(tw + z)$ is a polynomial of degree $m$ in $t$ (for each given $z$).
Fix $z = z_0$, and let $t_0$ be a zero of order $k$ of $P(tw + z_0)$. For $j < k$, set

$$\phi_j(x, t) = (x\cdot w)^j\exp(-ix\cdot(tw + z_0)). \tag{2}$$

Then

$$P(-D)\phi_j(x, t) = P(-D)\Big(i\frac{\partial}{\partial t}\Big)^j\exp(-ix\cdot(tw + z_0)) = \Big(i\frac{\partial}{\partial t}\Big)^jP(-D)\exp(-ix\cdot(tw + z_0)) = \Big(i\frac{\partial}{\partial t}\Big)^j\big[P(tw + z_0)\exp(-ix\cdot(tw + z_0))\big],$$

and therefore $P(-D)\phi_j(x, t_0) = 0$, that is, $\phi_j(\cdot, t_0)$ are exponential solutions of
the homogeneous adjoint equation. By hypothesis, we then have $f(\phi_j(\cdot, t_0)) = 0$
for all $j < k$. This means that, for all $j < k$,

$$\frac{\partial^j}{\partial t^j}\Big|_{t=t_0}\hat{f}(tw + z_0) = 0,$$




that is, $\hat{f}(tw + z_0)$ has a zero of order $\ge k$ at $t = t_0$, and $F(\cdot; z_0, w)$ is consequently
entire (cf. (1)).

Choose $r > 0$ such that $P(tw + z_0) \ne 0$ on the circle $\Gamma := \{t;\ |t| = r\}$
(this is possible since, as a polynomial in $t$, $P(tw + z_0)$ has finitely many zeros).
By continuity, $P(tw + z) \ne 0$ for all $t$ on $\Gamma$ and $z$ in a neighborhood $U$ of $z_0$.
Therefore, the function $G(z) := \frac{1}{2\pi i}\int_\Gamma F(t; z, w)\,dt/t$ is analytic in $U$, that is,
$G$ is entire (by the arbitrariness of $z_0$). However, by Cauchy's integral theorem
for the entire function $F(\cdot; z, w)$, we have $G(z) = F(0; z, w) = \hat{f}(z)/P(z)$, and
Statement (3) is proved.

3 $\Rightarrow$ 1. By Theorem II.3.6, it suffices to show that $F$ satisfies the estimate
in Part (i) of Theorem II.3.6 (for then $F$ is the Fourier-Laplace transform of
some $u \in \mathcal{E}'$; restricting to $\mathbb{R}^n$, we have therefore $\mathcal{F}f = P\mathcal{F}u = \mathcal{F}P(D)u$, hence
$P(D)u = f$).

Fix $\zeta \in \mathbb{C}^n$ and apply (13) in the proof of Lemma II.4.3 to the entire function
$f(z) = F(\zeta + z)$, the polynomial $p(z) = P(\zeta + z)$, and $g$ the indicator of the unit
ball $B$ of $\mathbb{C}^n$. Then

$$|F(\zeta)P^{(\alpha)}(\zeta)| \le C\int_B |\hat{f}(\zeta + z)|\,dz \le C|B|\sup_{z \in B}|\hat{f}(\zeta + z)|,$$

where $C$ is a constant depending only on $n$ and $m = \deg P$. Choose $\alpha$ such that
$P^{(\alpha)}$ is a non-zero constant. Then, by the necessity of the estimate in Part (i) of
Theorem II.3.6 (applied to the distribution $f \in \mathcal{E}'$), we have

$$|F(\zeta)| \le C_1\sup_{z \in B}(1 + |\zeta + z|)^ke^{A|\Im(\zeta + z)|} \le C_2(1 + |\zeta|)^ke^{A|\Im\zeta|},$$

as desired.


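II.6 Regularity of solutions


II.6.1. Let $\Omega$ be an open subset of $\mathbb{R}^n$. If $\phi \in \mathcal{D}(\Omega)$ and $u \in \mathcal{D}'(\Omega)$, the product
$\phi u$ is a distribution with compact support in $\Omega$, and may then be considered as
an element of $\mathcal{E}' := \mathcal{E}'(\mathbb{R}^n) \subset \mathcal{S}' := \mathcal{S}'(\mathbb{R}^n)$. Set (for any weight $k$ and $p \in [1, \infty]$)

$$W^{loc}_{p,k}(\Omega) = \{u \in \mathcal{D}'(\Omega);\ \phi u \in W_{p,k} \text{ for all } \phi \in \mathcal{D}(\Omega)\}.$$

(cf. comments following Theorem II.4.2.) Note that if $u \in \mathcal{E}(\Omega)$, then $\phi u \in
\mathcal{D}(\Omega) \subset \mathcal{S} \subset W_{p,k}$ for all $\phi \in \mathcal{D}(\Omega)$, that is, $\mathcal{E}(\Omega) \subset W^{loc}_{p,k}(\Omega)$ for all $p, k$.
Conversely, if $u \in W^{loc}_{p,k}(\Omega)$ for all $p, k$ (or even for some $p$ and all weights $k_s$),
it follows from (52) in Section II.3.9 that $u \in \mathcal{E}(\Omega)$. This observation gives an
approach for proving regularity of distribution solutions of the equation $P(D)u =
f$ in $\Omega$ (for suitable $f$): it would suffice to prove that the solutions $u$ belong to
all the spaces $W^{loc}_{p,k}(\Omega)$ (since then $u \in \mathcal{E}(\Omega)$).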




II.6.2. Hypoellipticity.
The polynomial $P$ (or the differential operator $P(D)$) is hypoelliptic if there
exist constants $C, c > 0$ such that

$$\Big|\frac{P^{(\alpha)}(x)}{P(x)}\Big| \le C|x|^{-c|\alpha|} \tag{1}$$

as $x \in \mathbb{R}^n \to \infty$, for all multi-indices $\alpha \ne 0$.
Conditions equivalent to (1) are any one of the following conditions (2)-(4):

$$\lim_{x \in \mathbb{R}^n \to \infty}\frac{P^{(\alpha)}(x)}{P(x)} = 0 \tag{2}$$

for all $\alpha \ne 0$;

$$\lim_{x \in \mathbb{R}^n \to \infty}\operatorname{dist}(x, N(P)) = \infty, \tag{3}$$

where $N(P) := \{z \in \mathbb{C}^n;\ P(z) = 0\}$;

$$\operatorname{dist}(x, N(P)) \ge C|x|^c \tag{4}$$

as $x \in \mathbb{R}^n \to \infty$, for suitable positive constants $C, c$ (these equivalent descriptions
of hypoellipticity are not used in the sequel). For example, if the principal part
$P_m$ of $P$ does not vanish for $0 \ne x \in \mathbb{R}^n$ (in this case, $P$ and $P(D)$ are said to
be elliptic), Condition (2) is clearly satisfied; thus elliptic differential operators
are hypoelliptic.


Theorem II.6.3. Let $P$ be a hypoelliptic polynomial, and let $\Omega$ be an open subset
of $\mathbb{R}^n$. If $u \in \mathcal{D}'(\Omega)$ is a solution of the equation $P(D)u = f$ with $f \in W^{loc}_{p,k}(\Omega)$,
then $u \in W^{loc}_{p,kk_P}(\Omega)$. In particular, if $f \in \mathcal{E}(\Omega)$, then $u \in \mathcal{E}(\Omega)$.

(The following converse is also true (proof omitted): suppose that for some $\Omega$,
some $p \in [1, \infty]$, and some weight $k$, every solution of the equation $P(D)u = 0$
in $W^{loc}_{p,k}(\Omega)$ is in $\mathcal{E}(\Omega)$. Then $P$ is hypoelliptic.)

Proof. Fix $\omega \subset\subset \Omega$. For any $u \in \mathcal{D}'(\Omega)$ and $\phi \in \mathcal{D}(\omega)$, we view $\phi u$ as an
element of $\mathcal{E}'$ with support in $\omega$ (cf. II.6.1). By the necessity of the estimate in
Part (i) of Theorem II.3.6, $|\mathcal{F}(\phi u)(x)| \le M(1 + |x|)^r$ for some constants $M, r$
independent of $\phi$. Hence, for any given $p$, there exists $s$ (independent of $\phi$) such
that $k_{-s}\mathcal{F}(\phi u) \in L^p$. Denote $k' = k_{-s}$ for such an $s$ (fixed from now on). Thus,
$\phi u \in W_{p,k'}$ for all $\phi \in \mathcal{D}(\omega)$, that is, $u \in W^{loc}_{p,k'}(\omega)$.

The hypoellipticity condition (1) (Section II.6.2) implies the existence of a
constant $C' > 0$ such that $|P^{(\alpha)}/P| \le (1/C')|x|^{-c|\alpha|}$ for all $\alpha \ne 0$. Summing
over all $\alpha \ne 0$ with $|\alpha| \le m$, we get that $k'_P/|P| \le (1/C'')(1 + |x|)^{-c}$ for some
constant $C'' > 0$ (cf. notation at the end of II.3.7). Hence

$$\frac{k_P}{k'_P} \ge \frac{|P|}{k'_P} \ge C''(1 + |x|)^c. \tag{5}$$




Given the weight $k$, $kk_P/k'$ is a weight, and therefore it is $O((1+|x|)^\nu)$ for some
$\nu$. Consequently there exists a positive integer $r$ (depending only on the ratio
$k/k'$) such that $kk_P/k' \le \text{const}\cdot(1 + |x|)^{cr}$. By (5), it then follows that

$$kk_P \le Ck'\Big(\frac{k_P}{k'_P}\Big)^r \tag{6}$$

for some constant $C$.

Claim. If $k$ is any weight such that $f \in W^{loc}_{p,k}(\omega)$ and (6) with $r = 1$ is valid (that
is, $k \le C(k'/k'_P)$), then any solution $u \in W^{loc}_{p,k'}(\omega)$ of the equation $P(D)u = f$
is necessarily in $W^{loc}_{p,kk_P}(\omega)$.

Proof of claim. We first observe that

$$P^{(\alpha)}(D)W^{loc}_{p,k'}(\omega) \subset W^{loc}_{p,k'/k'_P}(\omega) \qquad (\alpha \ne 0) \tag{7}$$

(cf. II.3.9, Relation (48)). Therefore, $P^{(\alpha)}(D)u \in W^{loc}_{p,k'/k'_P}(\omega) \subset W^{loc}_{p,k}(\omega)$ for all
$\alpha \ne 0$ (for $u$ as in the claim, because $k \le C(k'/k'_P)$).
If $\phi \in \mathcal{D}(\omega)$, we have by Leibnitz's formula (II.2.9, (5))

$$P(D)(\phi u) = \phi f + \sum_{\alpha \ne 0} D^\alpha\phi\,P^{(\alpha)}(D)u/\alpha!.$$

The first term is in $W_{p,k}$ by hypothesis. The sum over $\alpha \ne 0$ is in $W_{p,k}$ by the
preceding observation. Hence $P(D)(\phi u) \in W_{p,k}$ (and has compact support).

Let $v \in W^{loc}_{\infty,k_P}(\mathbb{R}^n)$ be a fundamental solution for $P(D)$ (by Theorem II.4.2
and the observation following its statement). Then

$$\phi u = v * [P(D)(\phi u)] \in W^{loc}_{p,kk_P}$$

since for any weights $k, k_1$ (cf. II.3.9, (47))

$$W^{loc}_{\infty,k_1}(\mathbb{R}^n) * [W_{p,k} \cap \mathcal{E}'] \subset W^{loc}_{p,kk_1}.$$

This concludes the proof of the claim.
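Suppose now that $r > 1$. Consider the weights

$$k_j = k\Big(\frac{k'_P}{k_P}\Big)^j, \qquad j = 0,\ldots,r-1.$$

Since

$$k = k_0 \ge k_1 \ge \cdots \ge k_{r-1},$$

we have $f \in W^{loc}_{p,k_j}(\omega)$ for all $j = 0,\ldots,r-1$. Also by (6)

$$k'_Pk_{r-1} = k'_Pk\Big(\frac{k'_P}{k_P}\Big)^{r-1} = kk_P\Big(\frac{k'_P}{k_P}\Big)^r \le Ck'.$$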




We may then apply the claim with the weight $k_{r-1}$ replacing $k$. Then $u \in
W^{loc}_{p,k_Pk_{r-1}}(\omega)$, $f \in W^{loc}_{p,k_{r-2}}(\omega)$, and $k_{r-2} = k_Pk_{r-1}/k'_P$. By the claim with
the weights $k, k'$ replaced by $k_{r-2}, k_Pk_{r-1}$ (respectively), it follows that $u \in
W^{loc}_{p,k_Pk_{r-2}}(\omega)$. Repeating this argument, we obtain finally (since $k_0 = k$) that $u \in
W^{loc}_{p,k_Pk}(\omega)$. This being true for any $\omega \subset\subset \Omega$, we conclude that $u \in W^{loc}_{p,k_Pk}(\Omega)$.


II.7 Variable coefficients


II.7.1. The constant coefficients theory of II.4.1 and Theorem II.4.2 can be
applied "locally" to linear differential operators $P(x, D)$ with (locally) $C^\infty$-coefficients.
(This means that $P(x, y)$ is a polynomial in $y \in \mathbb{R}^n$, with coefficients
that are $C^\infty$-functions of $x$ in some neighborhood $\Omega \subset \mathbb{R}^n$ of $x^0$.) Denote
$P_0 = P(x^0, \cdot)$. We shall assume that there exist $\epsilon > 0$ and $0 < M < \infty$ such that
the $\epsilon$-neighborhood $V$ of $x^0$ is contained in $\Omega$ and for all $x \in V$

$$\frac{k_{P(x,\cdot)}}{k_{P_0}} \le M. \tag{1}$$

The method described next regards the operator $P(x, D)$ as a "perturbation" of
the operator $P_0(D)$ for $x$ in a "small" neighborhood of $x^0$.

Let $r + 1$ be the (finite!) dimension of the space of polynomials $Q$ such that
$k_Q/k_{P_0}$ is bounded, and choose a basis $P_0, P_1, \ldots, P_r$ for this space. By (1), we
have a unique representation

$$P(x, \cdot) = P_0 + \sum_{j=1}^r c_j(x)P_j \tag{2}$$

for all $x \in V$. Necessarily $c_j(x^0) = 0$ (take $x = x^0$) and $c_j \in C^\infty(V)$.
By Theorem II.4.2, we may choose a fundamental solution $v \in W^{loc}_{\infty,k_{P_0}}$ for
the operator $P_0(D)$. Fix $\chi \in \mathcal{D}$ such that $\chi = 1$ in a $3\epsilon$-neighborhood of $x^0$.
Then

$$w := \chi v \in W_{\infty,k_{P_0}}, \tag{3}$$

and for all $h \in \mathcal{E}'(V)$,

$$P_0(D)(w * h) = w * (P_0(D)h) = v * P_0(D)h = h. \tag{4}$$

(The second equality follows from the fact that $\operatorname{supp} P_0(D)h \subset V$ and $w * g =
v * g$ for all $g \in \mathcal{E}'(V)$.) By (2) and (4)

$$P(x, D)(w * h) = h + \sum_{j=1}^r c_j(x)P_j(D)(w * h) \tag{5}$$

for all $h \in \mathcal{E}'(V)$.




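We localize to a suitable $\delta$-neighborhood of $x^0$ by fixing some function $\phi \in \mathcal{D}$
such that $\phi = 1$ for $|x| \le 1$ and $\operatorname{supp}\phi \subset \{x;\ |x| < 2\}$, and letting $\phi_\delta(x) =
\phi((x - x^0)/\delta)$. (Thus $\phi_\delta = 1$ for $|x - x^0| \le \delta$ and $\operatorname{supp}\phi_\delta \subset \{x;\ |x - x^0| < 2\delta\}$.)

By (5), whenever $\delta < \epsilon$ and $h \in \mathcal{E}'(V)$,

$$P(\cdot, D)(w * h) = h + \sum_{j=1}^r \phi_\delta c_jP_j(D)(w * h) \tag{6}$$

in $|x - x^0| < \delta$.

Claim. There exists $\delta_0 < \epsilon/2$ such that, for $\delta < \delta_0$, the equation

$$h + \sum_{j=1}^r \phi_\delta c_jP_j(D)(w * h) = \phi_\delta f \tag{7}$$

(for any $f \in \mathcal{E}'$) has a unique solution $h \in \mathcal{E}'$.

Assuming the claim, the solution $h$ of (7) satisfies (by (6))

$$P(\cdot, D)(w * h) = \phi_\delta f = f \tag{8}$$

in $V_\delta := \{x;\ |x - x^0| < \delta\}$. (Since $2\delta < \epsilon$, $\operatorname{supp}\phi_\delta \subset V$, and therefore, $\operatorname{supp} h \subset V$
by (7), and (6) applies.)
In other words, $u = w * h \in \mathcal{E}'$ solves the equation $P(\cdot, D)u = f$ in $V_\delta$.
Equivalently, the map

$$T: f \in \mathcal{E}' \to w * h \in \mathcal{E}' \tag{9}$$

(with $h$ as in the "claim") is "locally" a right inverse of the operator $P(\cdot, D)$,
that is,

$$P(x, D)Tf = f \qquad (f \in \mathcal{E}';\ x \in V_\delta). \tag{10}$$

The operator $T$ is also a left inverse of $P(\cdot, D)$ (in the shown local sense). Indeed,
given $u \in \mathcal{E}'(V_\delta)$, we take $f := P(\cdot, D)u$ and $h := P_0(D)u$. By (4), $w * h =
w * P_0(D)u = u$ (since $u \in \mathcal{E}'(V)$). Therefore, the left-hand side of (7) equals

$$P_0(D)u + \sum_j \phi_\delta c_jP_j(D)u = P(\cdot, D)u = f = \phi_\delta f$$

in $V_\delta$ (since $\phi_\delta = 1$ in $V_\delta$). Thus "our" $h$ is the (unique) solution of (7) (for
"our" $f$) in $V_\delta$. Consequently

$$TP(\cdot, D)u := w * h = u \qquad (u \in \mathcal{E}'(V_\delta)) \tag{11}$$

in $V_\delta$. Since $2\delta < \epsilon$, (11) is true in $V_\delta$ for all $u \in \mathcal{E}'(\mathbb{R}^n)$. Modulo the "claim", we
proved the first part of the following.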


Theorem II.7.2. Let $P(\cdot, D)$ have $C^\infty$-coefficients and satisfy

$$k_{P(x,\cdot)} \le Mk_{P_0} \tag{*}$$




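in an $\epsilon$-neighborhood of $x^0$ (where $M$ is a constant and $P_0 := P(x^0, \cdot)$). Then
there exists a $\delta$-neighborhood $V_\delta$ of $x^0$ (with $\delta < \epsilon$) and a linear map $T: \mathcal{E}' \to \mathcal{E}'$
such that

$$P(\cdot, D)Tg = TP(\cdot, D)g = g \ \text{ in } V_\delta \qquad (g \in \mathcal{E}').$$

Moreover, the restriction of $T$ to the subspace $W_{p,k} \cap \mathcal{E}'$ of $W_{p,k}$ is a bounded
operator into $W_{p,kk_{P_0}}$, for any weight $k$.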


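Proof. We first prove the "claim" (this will complete the proof of the first part
of the theorem).
For any $\delta < \epsilon$, consider the map

$$S_\delta: h \in \mathcal{S}' \to \sum_{j=1}^r \phi_\delta c_jP_j(D)(w * h).$$

Since $w \in W_{\infty,k_{P_0}}$ and $k_{P_j}/k_{P_0}$ are bounded (by definition of $P_j$), we have

$$|P_j\mathcal{F}w| \le k_{P_j}|\mathcal{F}w| \le \text{const}\cdot k_{P_0}|\mathcal{F}w| \le C < \infty \qquad (j = 0,\ldots,r). \tag{12}$$

Let $k$ be any given weight. By (51) in Section II.3.9, there exists $t_0 > 0$ such
that, for $0 < t < t_0$,

$$\|S_\delta h\|_{p,k^t} \le 2(2\pi)^{-n/2}\sum_{j=1}^r \|\phi_\delta c_j\|_{1,1}\|P_j(D)(w * h)\|_{p,k^t}. \tag{13}$$

By the inequality preceding (47) in II.3.9,

$$\|P_j(D)(w * h)\|_{p,k^t} = \|[P_j(D)w] * h\|_{p,k^t} \le (2\pi)^{n/2}\|P_j(D)w\|_{\infty,1}\|h\|_{p,k^t}$$

for all $h \in W_{p,k} = W_{p,k^t}$. Since by (12)

$$\|P_j(D)w\|_{\infty,1} = \|\mathcal{F}[P_j(D)w]\|_\infty = \|P_j\mathcal{F}w\|_\infty \le C$$

(for $j = 1,\ldots,r$), we obtain from (13)

$$\|S_\delta h\|_{p,k^t} \le 2C\sum_{j=1}^r \|\phi_\delta c_j\|_{1,1}\|h\|_{p,k^t} \tag{14}$$

for all $h \in W_{p,k}$.

Since $c_j(x^0) = 0$, $c_j = O(\delta)$ on $\operatorname{supp}\phi_\delta$ by the mean value inequality. Using
the definition of $\phi_\delta$, it follows that $c_jD^\alpha\phi_\delta = O(\delta^{1-|\alpha|})$. Hence by Leibnitz's
formula, $D^\alpha(\phi_\delta c_j) = O(\delta^{1-|\alpha|})$. Therefore, since the measure of $\operatorname{supp}(\phi_\delta c_j)$ is
$O(\delta^n)$, we have

$$x^\alpha\mathcal{F}(\phi_\delta c_j) = \mathcal{F}[D^\alpha(\phi_\delta c_j)] = O(\delta^{n+1-|\alpha|}).$$

Hence

$$(1 + \delta|x|)^{n+1}\mathcal{F}(\phi_\delta c_j) = O(\delta^{n+1}).$$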




Consequently

$$\|\phi_\delta c_j\|_{1,1} = \int |\mathcal{F}(\phi_\delta c_j)|\,dx \le \text{const}\cdot\delta^{n+1}\int \frac{dx}{(1 + \delta|x|)^{n+1}} = O(\delta).$$

We may then choose $0 < \delta_0 < \epsilon/2$ such that

$$\sum_{j=1}^r \|\phi_\delta c_j\|_{1,1} < \frac{1}{4C} \tag{15}$$

for $0 < \delta < \delta_0$. By (14), we then have

$$\|S_\delta h\|_{p,k^t} \le (1/2)\|h\|_{p,k^t}$$

for all $h \in W_{p,k} = W_{p,k^t}$. This means that for $\delta < \delta_0$, the operator $S_\delta$ on the
Banach space $W_{p,k^t}$ has norm $\le 1/2$, and therefore $I + S_\delta$ has a bounded inverse
($I$ is the identity operator). Thus, (7) (in II.7.1) has a unique solution $h \in W_{p,k}$
for each $f \in W_{p,k}$. By the equation, $h$ is necessarily in $\mathcal{E}'$ (since $\phi_\delta \in \mathcal{D}$). If $f$
is an arbitrary distribution in $\mathcal{E}'$, the trivial part of the Paley-Wiener-Schwartz
theorem (II.3.6) shows that $\hat{f}(x) = O((1 + |x|)^N)$ for some $N$, and therefore,
$f \in W_{p,k}$ for suitable weight $k$ (e.g., $k = k_{-s}$ with $s$ large enough). Therefore
(for $\delta < \delta_0$) there exists a solution $h$ of (7) in $W_{p,k} \cap \mathcal{E}'$. The solution is unique
(in $\mathcal{E}'$), because if $h, h' \in \mathcal{E}'$ are solutions, there exists a weight $k$ such that
$f, h, h' \in W_{p,k}$, and therefore, $h = h'$ by the uniqueness of the solution in $W_{p,k}$.
This completes the proof of the claim.

Since $\|S_\delta\| \le 1/2$ (the norm is the $B(W_{p,k^t})$-norm!), we have $\|(I + S_\delta)^{-1}\| \le
2$ (by the Neumann expansion of the resolvent!), and therefore, $\|h\|_{p,k^t} \le
2\|\phi_\delta f\|_{p,k^t}$. Consequently (with $h$ related to $f$ as in the "claim", and $t$ small
enough), we have by the inequality preceding (47) and by (51) in II.3.9:

$$\|Tf\|_{p,k_{P_0}k^t} = \|w * h\|_{p,k_{P_0}k^t} \le (2\pi)^{n/2}\|w\|_{\infty,k_{P_0}}\|h\|_{p,k^t} \le 2(2\pi)^{n/2}\|w\|_{\infty,k_{P_0}}\|\phi_\delta f\|_{p,k^t} \le 4\|w\|_{\infty,k_{P_0}}\|\phi_\delta\|_{1,1}\|f\|_{p,k^t}.$$

This proves the second part of the theorem, since the norms $\|\cdot\|_{p,k}$ ($\|\cdot\|_{p,k_{P_0}k}$)
and $\|\cdot\|_{p,k^t}$ ($\|\cdot\|_{p,k_{P_0}k^t}$, respectively) are equivalent.
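
The step from $\|S_\delta\| \le 1/2$ to $\|(I + S_\delta)^{-1}\| \le 2$ is the usual Neumann series argument, and it can be seen in a finite-dimensional toy model. In the sketch below, the 3-by-3 matrix `S`, the vector `f`, and the function names are hypothetical; the sup norm on vectors stands in for the $W_{p,k^t}$-norm, and the iteration $h_{n+1} = f - Sh_n$ sums the Neumann series.

```python
def mat_vec(matrix, vector):
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

def sup_norm(vector):
    return max(abs(v) for v in vector)

def solve_perturbed_identity(S, f, iterations=60):
    """Solve h + S h = f by the iteration h_{n+1} = f - S h_n (converges when ||S|| < 1)."""
    h = list(f)
    for _ in range(iterations):
        Sh = mat_vec(S, h)
        h = [fi - si for fi, si in zip(f, Sh)]
    return h

# Toy matrix with absolute row sums <= 1/2, so its operator norm for the sup norm is <= 1/2.
S = [[0.2, -0.1, 0.2],
     [0.0, 0.3, -0.2],
     [-0.25, 0.05, 0.1]]
f = [1.0, -2.0, 0.5]
h = solve_perturbed_identity(S, f)

residual = sup_norm([hi + si - fi for hi, si, fi in zip(h, mat_vec(S, h), f)])
print(residual)                          # close to zero
print(sup_norm(h), 2.0 * sup_norm(f))    # ||h|| <= 2 ||f||, as in the proof
```

The bound $\|h\| \le 2\|f\|$ follows from $\|h\| \le \|f\| + \|Sh\| \le \|f\| + \|h\|/2$, which is the same estimate used for $h$ and $\phi_\delta f$ above.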


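Corollary II.7.3. For $P(\cdot, D)$ and $V_\delta$ as in Theorem II.7.2, the equation
$P(\cdot, D)u = f$ has a solution $u \in C^\infty(V_\delta)$ for each $f \in C^\infty(\mathbb{R}^n)$.


Proof. Fix $\phi \in \mathcal{D}$ such that $\phi = 1$ in a neighborhood of $V_\delta$. For $f \in C^\infty$,
$\phi f \in W_{p,k} \cap \mathcal{E}'$ for all $k$; therefore, $u := T(\phi f)$ is a solution of $P(\cdot, D)u = f$ in
$V_\delta$ (because $\phi f = f$ in $V_\delta$), which belongs to $W_{p,kk_{P_0}}$ for all weights $k$, hence
$u \in C^\infty(V_\delta)$.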


II.8 Convolution operators 


Let $h : \mathbb{R}^n \to \mathbb{C}$ be locally Lebesgue integrable, and consider the convolution
operator
$$T : u \mapsto h * u,$$
originally defined on the space $L^1_c$ of integrable functions $u$ on $\mathbb{R}^n$ with compact
support.
We set $h^t(x) := t^n h(tx)$, and make the following.


Hypothesis I.

$$\int_{|x| \ge 2} |h^t(x-y) - h^t(x)|\,dx \le K < \infty \qquad (|y| \le 1;\ t > 0). \tag{1}$$

Lemma II.8.1. If $u \in L^1_c$ vanishes outside the ball $B(a,t)$ and $\int u\,dx = 0$, then
$$\int_{B(a,2t)^c} |Tu|\,dx \le K\|u\|_1.$$
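Proof. Denote $u_a(x) = u(x+a)$. Since $(Tu)_a = Tu_a$, we have
$$\int_{B(a,2t)^c} |Tu|\,dx = \int_{B(0,2t)^c} |(Tu)_a|\,dx = \int_{B(0,2t)^c} |h * u_a|\,dx = \int_{B(0,2)^c} |h^t * (u_a)^t|\,dx.$$

Since $(u_a)^t(y) = t^n u(ty + a) = 0$ for $|y| > 1$ and $\int (u_a)^t(y)\,dy = \int u(x)\,dx = 0$,
the last integral is equal to
$$\int_{B(0,2)^c} \Big| \int_{|y| \le 1} [h^t(x-y) - h^t(x)]\,(u_a)^t(y)\,dy \Big|\,dx$$
$$\le \int_{B(0,2)^c} \int_{|y| \le 1} |h^t(x-y) - h^t(x)|\,|(u_a)^t(y)|\,dy\,dx$$
$$= \int_{|y| \le 1} \Big( \int_{B(0,2)^c} |h^t(x-y) - h^t(x)|\,dx \Big) |(u_a)^t(y)|\,dy \le K\|(u_a)^t\|_1 = K\|u\|_1.$$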


We shall need the following version of the Calderón–Zygmund decomposition
lemma.


Lemma II.8.2. Fix $s > 0$, and let $u \in L^1(\mathbb{R}^n)$. Then there exist disjoint open
(hyper)cubes $I_k$ and functions $u_k, v \in L^1(\mathbb{R}^n)$ ($k \in \mathbb{N}$), such that

(1) $u_k$ vanishes outside $I_k$ and $\int u_k\,dx = 0$ for all $k \in \mathbb{N}$;

(2) $|v| \le 2^n s$ a.e.;

(3) $u = v + \sum_k u_k$;

(4) $\|v\|_1 + \sum_k \|u_k\|_1 \le 3\|u\|_1$; and

(5) $\sum_k |I_k| \le \|u\|_1/s$ (where $|I_k|$ denotes the volume of $I_k$).




Proof. We first partition $\mathbb{R}^n$ into cubes of volume $> \|u\|_1/s$. For any such cube
$Q$, the average on $Q$ of $|u|$,
$$A_Q(|u|) := |Q|^{-1} \int_Q |u|\,dx,$$
satisfies
$$A_Q(|u|) \le |Q|^{-1}\|u\|_1 < s. \tag{2}$$
Subdivide $Q$ into $2^n$ congruent subcubes $Q_i$ (by dividing each side of $Q$ into two
equal intervals). If $A_{Q_i}(|u|) \ge s$ for all $i$, then
$$A_Q(|u|) = |Q|^{-1} \sum_i \int_{Q_i} |u|\,dx \ge |Q|^{-1} \sum_i s|Q_i| = s,$$
contradicting (2). Let $Q_{1,j}$ be the open subcubes of $Q$ on which the averages of
$|u|$ are $\ge s$, and let $Q'_{1,l}$ be the remaining subcubes (there is at least one subcube
of the latter kind). We have
$$s|Q_{1,j}| \le \int_{Q_{1,j}} |u|\,dx \le \int_Q |u|\,dx < s|Q| = 2^n s|Q_{1,j}|. \tag{3}$$

We define $v$ on the cubes $Q_{1,j}$ as the constant $A_{Q_{1,j}}(u)$ (for each $j$), and we
let $u_{1,j}$ be equal to $u - v$ on $Q_{1,j}$ and to zero on $Q_{1,j}^c$.

For each cube $Q'_{1,l}$ we repeat the construction we did with $Q$ (since the
average of $|u|$ over such cubes is $< s$, as it was over $Q$). We obtain the open
subcubes $Q_{2,j}$ (of the cubes $Q'_{1,l}$) on which the average of $|u|$ is $\ge s$, and
the remaining subcubes $Q'_{2,l}$ on which the average is $< s$. We then extend the
definition of $v$ to the subcubes $Q_{2,j}$ by assigning to $v$ the constant value $A_{Q_{2,j}}(u)$
on $Q_{2,j}$ (for each $j$). The functions $u_{2,j}$ are then defined in the same manner as
$u_{1,j}$, with $Q_{2,j}$ replacing $Q_{1,j}$.

Continuing this process (and renaming) we obtain a sequence of mutually
disjoint open cubes $I_k$, a sequence of measurable functions $u_k$ defined on $\mathbb{R}^n$,
and a measurable function $v$ defined on $\Omega := \bigcup_k I_k$ (which we extend to $\mathbb{R}^n$ by
setting $v = u$ on $\Omega^c$). By construction, Property 3 is satisfied.

Since the average of $|u|$ on each $I_k$ is $\ge s$ (by definition), we have
$$s \sum_k |I_k| \le \sum_k \int_{I_k} |u|\,dx = \int_\Omega |u|\,dx \le \|u\|_1,$$
and Property 5 is satisfied.
If $x \in \Omega$, then $x \in I_k$ for precisely one $k$, and therefore
$$|v(x)| = |A_{I_k}(u)| \le A_{I_k}(|u|) < 2^n s$$
by (3) (which is true for all the cubes $I_k$, by construction). If $x \notin \Omega$, there is
a sequence of open cubes $J_k$ containing $x$, over which the average of $|u|$ is $< s$,
such that $|J_k| \to 0$. This implies that $|u(x)| \le s$ a.e. on $\Omega^c$, and since $v = u$ on
$\Omega^c$, we conclude that $v$ has Property 2.




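By construction, $u_k$ vanishes outside $I_k$ and $\int u_k\,dx = \int_{I_k} u\,dx - \int_{I_k} v\,dx = 0$
for all $k \in \mathbb{N}$ (Property 1).
Since the $I_k$ are mutually disjoint and $\operatorname{supp} u_k \subset \bar I_k$, we have
$$\|v\|_1 + \sum_k \|u_k\|_1 = \int_{\Omega^c} |v|\,dx + \sum_k \Big( \int_{I_k} |v|\,dx + \int_{I_k} |u_k|\,dx \Big).$$

However $v = u$ on $\Omega^c$ and $u_k = u - v$ on $I_k$; therefore, the right-hand side is
$$\le \int_{\Omega^c} |u|\,dx + \sum_k \Big( 2\int_{I_k} |v|\,dx + \int_{I_k} |u|\,dx \Big).$$

Since $v$ has the constant value $A_{I_k}(u)$ on $I_k$,
$$\int_{I_k} |v|\,dx = |A_{I_k}(u)|\,|I_k| \le \int_{I_k} |u|\,dx,$$
and Property 4 follows.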


Consider now $u \in L^1(\mathbb{R}^n)$ with compact support and $\|u\|_1 = 1$. It follows from
the construction in the last proof that $v$ has compact support; by Property 1
in Lemma II.8.2, the $u_k$ have compact support as well, for all $k$. Therefore, $Tv$ and
$Tu_k$ are well defined (for all $k$), and
$$Tu = Tv + \sum_k Tu_k. \tag{4}$$
For any $r > 0$, we then have
$$[|Tu| > r] \subset [|Tv| > r/2] \cup \Big[ \sum_k |Tu_k| > r/2 \Big]. \tag{5}$$

Denote the sets in the last union by $F_r$ and $G_r$.

Let $B(a_k, t_k)$ be the smallest ball containing the cube $I_k$, and let $c_n$ be the
ratio of their volumes (depends only on the dimension $n$ of $\mathbb{R}^n$). Since $u_k$ vanishes
outside $B(a_k, t_k)$ and $\int u_k\,dx = 0$, we have by Lemma II.8.1
$$\int_{B(a_k, 2t_k)^c} |Tu_k|\,dx \le K\|u_k\|_1. \tag{6}$$

Let
$$E := \bigcup_k B(a_k, 2t_k).$$

Then (for $s > 0$ given as in Lemma II.8.2)
$$|E| \le \sum_k |B(a_k, 2t_k)| = 2^n \sum_k |B(a_k, t_k)| = 2^n c_n \sum_k |I_k| \le 2^n c_n/s. \tag{7}$$




Therefore
$$|G_r| = |G_r \cap E^c| + |G_r \cap E| \le |G_r \cap E^c| + |E| \le |G_r \cap E^c| + 2^n c_n/s. \tag{8}$$

Since $E^c \subset B(a_k, 2t_k)^c$ for all $k$, we have by (6)
$$\int_{E^c} |Tu_k|\,dx \le K\|u_k\|_1 \qquad (k = 1, 2, \ldots),$$
and therefore
$$|G_r \cap E^c| \le (2/r) \int_{E^c} \sum_k |Tu_k|\,dx = (2/r) \sum_k \int_{E^c} |Tu_k|\,dx \le (2/r) K \sum_k \|u_k\|_1 \le (6/r) K$$
by Property 4 of the functions $u_k$ (cf. Lemma II.8.2). We then conclude from (8)
that
$$|G_r| \le 6K/r + 2^n c_n/s. \tag{9}$$


In order to get an estimate for $|F_r|$, we make the following.


Hypothesis II.
$$\|T\phi\|_2 \le C\|\phi\|_2 \qquad (\phi \in \mathcal{D}), \tag{10}$$
for some finite constant $C > 0$.
Since $v$ is bounded a.e. (Property 2 in Lemma II.8.2) with compact support,
it belongs to $L^2$, and it follows from (10) and the density of $\mathcal{D}$ in $L^2$ that
$$\|Tv\|_2^2 \le C^2\|v\|_2^2 \le C^2\|v\|_\infty\|v\|_1 \le 3C^2 2^n s, \tag{11}$$
where Properties 2 and 4 in Lemma II.8.2 were used. Therefore,
$$|F_r| \le (4/r^2)\|Tv\|_2^2 \le 12\,C^2 2^n s/r^2,$$
and we conclude from (5) and (9) that
$$|[|Tu| > r]| \le 6K/r + 2^n c_n/s + 12\,C^2 2^n s/r^2.$$


The left-hand side being independent of $s > 0$, we may minimize the right-hand
side with respect to $s$; hence,
$$|[|Tu| > r]| \le C'/r, \tag{12}$$
where $C' := 6K + 2^{n+2}\sqrt{3c_n}\,C$ depends linearly on the constants $K$ and $C$ of
the hypotheses (1) and (10) (and on the dimension $n$).
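
The minimization over $s$ can be written out explicitly. The following computation is only a sketch; the abbreviations $a$, $b$ and $\varphi$ are introduced here for illustration and do not appear in the text.

```latex
% Abbreviations (ours): a = 2^n c_n,  b = 12 C^2 2^n / r^2.
\[
  \varphi(s) := \frac{a}{s} + b\,s \quad (s>0), \qquad
  \varphi'(s) = -\frac{a}{s^2} + b = 0 \iff s = \sqrt{a/b},
\]
\[
  \min_{s>0}\varphi(s) = 2\sqrt{ab}
  = \frac{2\sqrt{2^n c_n \cdot 12\,C^2\,2^n}}{r}
  = \frac{2^{n+2}\sqrt{3c_n}\,C}{r}.
\]
% Adding the s-independent term 6K/r:
\[
  \bigl|[\,|Tu|>r\,]\bigr| \le \frac{6K + 2^{n+2}\sqrt{3c_n}\,C}{r} = \frac{C'}{r}.
\]
```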

If $u \in L^1$ (with compact support) is not necessarily normalized, we consider
$w = u/\|u\|_1$ (when $\|u\|_1 > 0$). Then by (12) for $w$,
$$|[|Tu| > r]| = |[|Tw| > r/\|u\|_1]| \le C'\|u\|_1/r. \tag{13}$$




Since (13) is trivial when ||u||; = 0, we proved the following. 


Lemma II.8.3. Let $h : \mathbb{R}^n \to \mathbb{C}$ be locally Lebesgue integrable and satisfy
Hypotheses I and II (where $T$ denotes the convolution operator $T : u \mapsto h * u$,
originally defined on $L^1_c$). Then there exists a positive constant $C'$ depending
linearly on the constants $K$ and $C$ of the hypotheses (and on the dimension $n$)
such that
$$|[|Tu| > r]| \le C'\|u\|_1/r \qquad (r > 0)$$
for all $u \in L^1_c$.


Under the hypothesis of Lemma II.8.3, the linear operator $T$ is of weak type
$(1,1)$ with weak $(1,1)$-norm $\le C'$, and of strong (hence weak) type $(2,2)$ with
strong (hence weak) $(2,2)$-norm $\le C \le C'$. By Theorem 5.41, it follows that $T$
is of strong type $(p,p)$ with strong $(p,p)$-norm $\le A_p C'$, for any $p$ in the interval
$1 < p < 2$, where $A_p$ depends only on $p$ (and is bounded when $p$ is bounded
away from 1).

Let $\tilde h(x) := h(-x)$, and let $\tilde T$ be the corresponding convolution operator.
Since $\tilde h$ satisfies hypotheses (1) and (10) (when $h$ does) with the same constants
$K$ and $C$, the operator $\tilde T$ is of strong type $(p,p)$ with strong $(p,p)$-norm $\le A_p C'$
for all $p \in (1,2]$. Let $q$ be the conjugate exponent of $p$ (for a given $p \in (1,2]$).
Then for all $u \in \mathcal{D}$,
$$\|Tu\|_q = \sup\Big\{ \Big| \int (Tu)\,v\,dx \Big|;\ v \in \mathcal{D},\ \|v\|_p = 1 \Big\}$$
$$= \sup \Big| \int\!\!\int h(x-y)\,u(y)\,dy\,v(x)\,dx \Big|$$
$$= \sup \Big| \int \Big( \int \tilde h(y-x)\,v(x)\,dx \Big) u(y)\,dy \Big| = \sup \Big| \int (\tilde T v)\,u\,dy \Big|$$
$$\le \sup \|\tilde T v\|_p \|u\|_q \le A_p C' \|u\|_q.$$

Thus, $T$ is of strong type $(q,q)$ with strong $(q,q)$-norm $\le A_p C'$. Since $q$ varies
over the interval $[2,\infty)$ when $p$ varies in $(1,2]$, we conclude that $T$ is of strong
type $(p,p)$ with strong $(p,p)$-norm $\le A'_p C'$ for all $p \in (1,\infty)$ ($A'_p = A_p$ for
$p \in (1,2]$ and $A'_p = A_{p'}$ for $p \in [2,\infty)$, where $p'$ is the conjugate exponent of $p$).
Observe that $A'_p$ is a bounded function of $p$ in any compact subset of $(1,\infty)$. We
proved the following.


Theorem II.8.4 (Hörmander). Let $h : \mathbb{R}^n \to \mathbb{C}$ be locally integrable, and
let $T : u \mapsto h * u$ be the corresponding convolution operator (originally defined
on $L^1_c$). Assume Hypotheses I and II. Then for all $p \in (1,\infty)$, $T$ is of strong
type $(p,p)$ with strong $(p,p)$-norm $\le A_p C'$, where the constant $A_p$ depends only
on $p$ and is a bounded function of $p$ in any compact subset of $(1,\infty)$, and
$C'$ depends linearly on the constants $K$ and $C$ of the hypotheses (and on the
dimension $n$).




In order to apply Theorem II.8.4 to some special convolution operators, we 
need the following. 


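Lemma II.8.5. Let $S := \{y \in \mathbb{R}^n;\ 1/2 < |y| < 2\}$. There exists $\phi \in \mathcal{D}(S)$ with
range in $[0,1]$ such that
$$\sum_{k \in \mathbb{Z}} \phi(2^{-k}y) = 1 \qquad (y \in \mathbb{R}^n - \{0\}).$$
(For each $0 \ne y \in \mathbb{R}^n$, at most two summands of the series are $\ne 0$.)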


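Proof. Fix $\psi \in \mathcal{D}(S)$ such that $\psi(x) = 1$ for $3/4 \le |x| \le 3/2$. Let $y \in \mathbb{R}^n - \{0\}$
and $k \in \mathbb{Z}$. If $\psi(2^{-k}y) \ne 0$, then $2^{-k}y \in S$, that is, $\log_2|y| - 1 < k < \log_2|y| + 1$;
there are at most two values of the integer $k$ in that range. Moreover, $\psi(2^{-k}y) = 1$
if $3/4 \le 2^{-k}|y| \le 3/2$, that is, if $\log_2(|y|/3) + 1 \le k \le \log_2(|y|/3) + 2$; there is at
least one value of $k$ in this range. It follows that
$$1 \le \sum_{k \in \mathbb{Z}} \psi(2^{-k}y) < \infty$$
(there are at most two non-zero terms in the series, all terms are $\ge 0$, and at
least one term equals 1).
Define
$$\phi(y) := \frac{\psi(y)}{\sum_{k \in \mathbb{Z}} \psi(2^{-k}y)}.$$
Then $\phi \in \mathcal{D}(S)$ has range in $[0,1]$ and for each $y \ne 0$
$$\phi(2^{-j}y) = \frac{\psi(2^{-j}y)}{\sum_{k \in \mathbb{Z}} \psi(2^{-j-k}y)} = \frac{\psi(2^{-j}y)}{\sum_{k \in \mathbb{Z}} \psi(2^{-k}y)};$$
hence, $\sum_{j \in \mathbb{Z}} \phi(2^{-j}y) = 1$.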


The following discussion is restricted to the case $n = 1$ for simplicity (a similar
analysis can be done in the general case). Fix $\phi$ as in Lemma II.8.5; let $c :=
\max(1, \sup|\phi'|)$, and denote $\phi_k(y) := \phi(2^{-k}y)$ for $k \in \mathbb{Z}$.

Let $f$ be a measurable complex function, locally square integrable on $\mathbb{R}$.
Denote $f_k := f\phi_k$. Then
$$\operatorname{supp} f_k \subset 2^k \operatorname{supp}\phi \subset 2^k S,$$
$$f = \sum_{k \in \mathbb{Z}} f_k \quad \text{on } \mathbb{R} - \{0\},$$
and $|f_k| \le |f|$.
Let $I_k$ denote the indicator of the set $2^k S$. Then $|f_k| = |f_k I_k| \le |f I_k|$, and
therefore,
$$\|f_k\|_2 \le \|f I_k\|_2 < \infty. \tag{14}$$


Consider $f$ and $f_k$ as distributions, and suppose that $Df$ (in distribution sense)
is a locally square integrable (measurable) function. Since
$$|Df_k| \le |Df|\,\phi_k + 2^{-k} \sup|\phi'|\,|f| \le c\,(|Df| + 2^{-k}|f|),$$
it follows that
$$\|Df_k\|_2 = \|(Df_k)I_k\|_2 \le c\,\big(\|(Df)I_k\|_2 + 2^{-k}\|f I_k\|_2\big)$$
$$= c\,2^{-k/2}\big[2^{-k/2}\|f I_k\|_2 + 2^{k/2}\|(Df)I_k\|_2\big]. \tag{15}$$


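Notation. We denote by $\mathcal{H}$ the space of all measurable complex functions $f$ on
$\mathbb{R}$, locally square integrable on $\mathbb{R}$, for which
$$\|f\|_{\mathcal{H}} := \sup_{k \in \mathbb{Z}} \big[2^{-k/2}\|f I_k\|_2 + 2^{k/2}\|(Df)I_k\|_2\big] < \infty.$$

Assume $f \in \mathcal{H}$. It follows from (14) and (15) that
$$\|f_k\|_2 \le 2^{k/2}\|f\|_{\mathcal{H}} \quad \text{and} \quad \|Df_k\|_2 \le c\,2^{-k/2}\|f\|_{\mathcal{H}} \tag{16}$$
for all $k \in \mathbb{Z}$.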

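Let $g_k = F^{-1}f_k$. Since $F^{-1}$ is isometric on $L^2$ and $F^{-1}D = -MF^{-1}$ (where
$M$ denotes the operator of multiplication by the independent variable), we have
by (16):
$$\int (1 + 2^{2k}x^2)\,|g_k(x)|^2\,dx = \|g_k\|_2^2 + 2^{2k}\|(-M)g_k\|_2^2$$
$$= \|f_k\|_2^2 + 2^{2k}\|Df_k\|_2^2 \le 2^k(1 + c^2)\,\|f\|_{\mathcal{H}}^2. \tag{17}$$

Therefore, by Schwarz's inequality
$$\|g_k\|_1 = \int (1 + 2^{2k}x^2)^{-1/2}\,\big[(1 + 2^{2k}x^2)^{1/2}|g_k(x)|\big]\,dx$$
$$\le \Big( \int \frac{dx}{1 + 2^{2k}x^2} \Big)^{1/2} \Big( \int (1 + 2^{2k}x^2)\,|g_k(x)|^2\,dx \Big)^{1/2}$$
$$\le \sqrt{1 + c^2}\,\|f\|_{\mathcal{H}} \Big( \int \frac{2^k\,dx}{1 + (2^k x)^2} \Big)^{1/2} = c'\|f\|_{\mathcal{H}},$$
where $c' = \sqrt{\pi(1 + c^2)}$. Hence
$$|f_k| = |Fg_k| \le \|g_k\|_1 \le c'\|f\|_{\mathcal{H}} \qquad (k \in \mathbb{Z}). \tag{18}$$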
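
The value of $c'$ rests on an elementary integral. As a sketch (the substitution variable $y$ is introduced here only for illustration):

```latex
\[
  \int_{\mathbb{R}} \frac{dx}{1+2^{2k}x^2}
  = 2^{-k}\int_{\mathbb{R}} \frac{dy}{1+y^2} \quad (y = 2^k x)
  \;=\; 2^{-k}\pi ,
\]
% combined with (17):
\[
  \|g_k\|_1^2 \le 2^{-k}\pi \cdot 2^{k}(1+c^2)\,\|f\|_{\mathcal H}^2
  = \pi(1+c^2)\,\|f\|_{\mathcal H}^2 = (c')^2\,\|f\|_{\mathcal H}^2 .
\]
```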


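Consider now the “partial sums”
$$s_m := \sum_{|k| \le m} f_k \qquad (m = 1, 2, \ldots),$$
and let $h_m := F^{-1}s_m = \sum_{|k| \le m} g_k$.

Since at most two summands $f_k$ are $\ne 0$ at each point $y \ne 0$ (and $s_m(0) = 0$),
we have by (18)
$$|s_m| \le 2c'\|f\|_{\mathcal{H}} \qquad (m = 1, 2, \ldots). \tag{19}$$

Therefore, for all $\psi \in \mathcal{S} := \mathcal{S}(\mathbb{R})$ and $m \in \mathbb{N}$,
$$\|h_m * \psi\|_2 = \|F(h_m * \psi)\|_2 = \sqrt{2\pi}\,\|(Fh_m)(F\psi)\|_2$$
$$= \sqrt{2\pi}\,\|s_m F\psi\|_2 \le c''\|f\|_{\mathcal{H}}\|F\psi\|_2 = c''\|f\|_{\mathcal{H}}\|\psi\|_2,$$
where $c'' := 2\sqrt{2\pi}\,c'$. Thus, the $h_m$ satisfy Hypothesis II of Theorem II.8.4, with
$C = c''\|f\|_{\mathcal{H}}$ independent of $m$.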


Claim. $h_m$ satisfies Hypothesis I of Theorem II.8.4 with $K = K'\|f\|_{\mathcal{H}}$, where
$K'$ is a constant independent of $m$.


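Assuming the claim, it follows from Theorem II.8.4 that
$$\|h_m * \psi\|_p \le C_p\|f\|_{\mathcal{H}}\|\psi\|_p \qquad (m \in \mathbb{N}) \tag{20}$$
for all $p \in (1,\infty)$, where $C_p$ is a constant depending only on $p$.
Let $\psi, \chi \in \mathcal{S}$. By (19)
$$|s_m\,F\psi\,\overline{F\chi}| \le 2c'\|f\|_{\mathcal{H}}\,|F\psi|\,|F\chi| \in L^1.$$

Since $s_m \to f$ pointwise a.e., it follows from the Lebesgue dominated convergence
theorem that
$$\lim_m \int_{\mathbb{R}} s_m\,F\psi\,\overline{F\chi}\,dx = \int_{\mathbb{R}} f\,F\psi\,\overline{F\chi}\,dx. \tag{21}$$
On the other hand, by Parseval's identity and (20) (with $p' = p/(p-1)$)
$$\sqrt{2\pi}\,\Big| \int_{\mathbb{R}} s_m\,F\psi\,\overline{F\chi}\,dx \Big| = \Big| \int_{\mathbb{R}} F(h_m * \psi)\,\overline{F\chi}\,dx \Big| = \Big| \int_{\mathbb{R}} (h_m * \psi)\,\bar\chi\,dx \Big|$$
$$\le \|h_m * \psi\|_p\,\|\chi\|_{p'} \le C_p\|f\|_{\mathcal{H}}\|\psi\|_p\|\chi\|_{p'}$$
for all $m \in \mathbb{N}$. Hence
$$\Big| \int_{\mathbb{R}} f\,F\psi\,\overline{F\chi}\,dx \Big| \le (2\pi)^{-1/2} C_p\|f\|_{\mathcal{H}}\|\psi\|_p\|\chi\|_{p'}. \tag{22}$$

Let $u \in \mathcal{S}'$ be such that $Fu \in \mathcal{H}$. We may then apply (22) to $f = Fu$. Since
$f\,F\psi = Fu\,F\psi = (2\pi)^{-1/2}F(u * \psi)$, the integral on the left-hand side of (22) is
equal to $(2\pi)^{-1/2}\int_{\mathbb{R}} (u * \psi)\,\bar\chi\,dx$ (by Parseval's identity). Hence
$$\Big| \int_{\mathbb{R}} (u * \psi)\,\bar\chi\,dx \Big| \le C_p\|f\|_{\mathcal{H}}\|\psi\|_p\|\chi\|_{p'} \qquad (\chi \in \mathcal{S}).$$

Therefore
$$\|u * \psi\|_p \le C_p\|f\|_{\mathcal{H}}\|\psi\|_p \qquad (\psi \in \mathcal{S}). \tag{23}$$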




This proves the following result (once the “claim” is verified). 


Theorem II.8.6. Let $u \in \mathcal{S}'(\mathbb{R})$ be such that $Fu \in \mathcal{H}$. Then for each $p \in (1,\infty)$,
the map $T : \psi \in \mathcal{S} \to u * \psi$ is of strong type $(p,p)$, with strong $(p,p)$-norm
$\le C_p\|Fu\|_{\mathcal{H}}$, where the constant $C_p$ depends only on $p$, and is a bounded function
of $p$ in any compact subset of $(1,\infty)$.


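Proof of the “claim”. The change of variables $tx \to x$ and $ty \to y$ shows that
we must prove the estimate
$$\int_{|x| \ge 2t} |h_m(x-y) - h_m(x)|\,dx \le K \tag{24}$$
for all $t > 0$ and $|y| \le t$. Since $h_m = \sum_{|k| \le m} g_k$, we consider the corresponding
integrals for $g_k$:
$$\int_{|x| \ge 2t} |g_k(x-y) - g_k(x)|\,dx \le \int_{|x| \ge 2t} |g_k(x-y)|\,dx + \int_{|x| \ge 2t} |g_k(x)|\,dx. \tag{25}$$

The change of variable $x' = x - y$ in the first integral on the right-hand
side transforms it to $\int_{\{x';\,|x'+y| \ge 2t\}} |g_k(x')|\,dx'$. However, for $|y| \le t$, we have
$\{x';\ |x'+y| \ge 2t\} \subset \{x';\ |x'| \ge t\}$; therefore the first integral (and trivially, the
second as well) is $\le \int_{|x| \ge t} |g_k(x)|\,dx$.

By the Cauchy–Schwarz inequality and (17),
$$\int_{|x| \ge t} |g_k(x)|\,dx \le \Big( \int_{\mathbb{R}} (1 + 2^{2k}x^2)\,|g_k(x)|^2\,dx \Big)^{1/2} \Big( \int_{|x| \ge t} \frac{dx}{1 + 2^{2k}x^2} \Big)^{1/2}$$
$$\le \sqrt{1 + c^2}\,\|f\|_{\mathcal{H}}\,2^{-k/2} \Big( \int_{|x| \ge t} x^{-2}\,dx \Big)^{1/2}$$
$$= \sqrt{2(1 + c^2)}\,\|f\|_{\mathcal{H}}\,(2^k t)^{-1/2}.$$

Therefore, for all $t > 0$ and $|y| \le t$,
$$\int_{|x| \ge 2t} |g_k(x-y) - g_k(x)|\,dx \le 2\sqrt{2(1 + c^2)}\,\|f\|_{\mathcal{H}}\,(2^k t)^{-1/2}. \tag{26}$$

Another estimate of the integral on the left-hand side of (26) is obtained by
writing it in the form
$$\int_{|x| \ge 2t} \Big| \int_{x-y}^{x} g_k'(s)\,ds \Big|\,dx. \tag{27}$$

The integrand of the outer integral is $\le \int_x^{x-y} |g_k'(s)|\,ds$ for $y < 0$
($\le \int_{x-y}^{x} |g_k'(s)|\,ds$ for $y \ge 0$). Therefore, by Tonelli's theorem, the expression in
(27) is $\le \int_{\mathbb{R}} \big( \int_{s+y}^{s} dx \big) |g_k'(s)|\,ds$ for $y < 0$ ($\le \int_{\mathbb{R}} \big( \int_{s}^{s+y} dx \big) |g_k'(s)|\,ds$ for $y \ge 0$,
respectively) $= |y|\,\|g_k'\|_1 \le t\,\|g_k'\|_1$.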




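Since $\operatorname{supp} f_k \subset 2^k S \subset B(0, 2^{k+1})$ and $g_k = F^{-1}f_k = F\tilde f_k$, the Paley–Wiener–Schwartz
theorem (Theorem II.3.6) implies that $g_k$ extends to $\mathbb{C}$ as an entire function of exponential type $\le 2^{k+1}$, and
is bounded on $\mathbb{R}$. By Bernstein's inequality (cf. Example in Section II.3.6, (31))
and (18)
$$\|g_k'\|_1 \le 2^{k+1}\|g_k\|_1 \le c'\|f\|_{\mathcal{H}}\,2^{k+1},$$
and it follows that the integral on the left-hand side of (26) is $\le 2c'\|f\|_{\mathcal{H}}\,2^k t$.
Hence (for all $t > 0$ and $|y| \le t$),
$$\int_{|x| \ge 2t} |g_k(x-y) - g_k(x)|\,dx \le C\|f\|_{\mathcal{H}} \min\big(2^k t,\ (2^k t)^{-1/2}\big),$$
where $C = 2\sqrt{\pi(1 + c^2)}$. It follows that
$$\int_{|x| \ge 2t} |h_m(x-y) - h_m(x)|\,dx \le C\|f\|_{\mathcal{H}} \sum_{k \in \mathbb{Z}} \min\big(2^k t,\ (2^k t)^{-1/2}\big). \tag{28}$$
We split the sum as $\sum_{k \in J} + \sum_{k \in J^c}$, where
$$J := \{k \in \mathbb{Z};\ 2^k t < (2^k t)^{-1/2}\} = \{k;\ 2^k t < 1\} = \{k;\ k = -j,\ j > \log_2 t\},$$
and $J^c := \mathbb{Z} - J$. We have
$$\sum_{k \in J} = t \sum_{j > \log_2 t} \frac{1}{2^j} = \sum_{j - \log_2 t > 0} \frac{1}{2^{\,j - \log_2 t}} \le \sum_{j - \log_2 t > 0} \frac{1}{2^{[j - \log_2 t]}} \le 2.$$

Similarly
$$\sum_{k \in J^c} = t^{-1/2} \sum_{k \ge -\log_2 t} (1/\sqrt{2})^k = \sum_{k + \log_2 t \ge 0} (1/\sqrt{2})^{k + \log_2 t}$$
$$\le \sum_{k + \log_2 t \ge 0} (1/\sqrt{2})^{[k + \log_2 t]} \le \frac{1}{1 - (1/\sqrt{2})}.$$

We then conclude from (28) that $h_m$ satisfies Hypothesis I of Theorem II.8.4
with $K = K'\|f\|_{\mathcal{H}}$, where $K' = C\big(2 + \sqrt{2}/(\sqrt{2} - 1)\big)$ is independent of $m$.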
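
For orientation, the two geometric-series bounds can be combined numerically. This is only a sketch; the offset $\theta_0$ and the summation index $m$ are introduced here for illustration and are not notation from the text.

```latex
% Each sum runs over exponents \theta = \theta_0 + m, m = 0, 1, 2, ...,
% with \theta_0 > 0 in the first sum and \theta_0 \ge 0 in the second.
\[
  \sum_{m\ge 0} 2^{-(\theta_0+m)} = 2^{\,1-\theta_0} \le 2, \qquad
  \sum_{m\ge 0} (1/\sqrt2)^{\theta_0+m} \le \frac{1}{1-1/\sqrt2}
  = \frac{\sqrt2}{\sqrt2-1} = 2+\sqrt2 ,
\]
\[
  K' = C\Bigl(2+\frac{\sqrt2}{\sqrt2-1}\Bigr) = C\,(4+\sqrt2).
\]
```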


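Notation. Let
$$\mathcal{K} := \{f \in L^\infty;\ MDf \in L^\infty\},$$
where $L^\infty := L^\infty(\mathbb{R})$, $M$ denotes the multiplication by $x$ operator, and $D$ is
understood in the distribution sense.


The norm on $\mathcal{K}$ is
$$\|f\|_{\mathcal{K}} = \|f\|_\infty + \|MDf\|_\infty.$$

If $f \in \mathcal{K}$, we have for all $k \in \mathbb{Z}$
$$2^{-k/2}\|f I_k\|_2 \le 2^{-k/2}\|f\|_\infty\,|2^k S|^{1/2} = \sqrt{3}\,\|f\|_\infty,$$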


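and
$$2^{k/2}\|(Df)I_k\|_2 = 2^{k/2}\|(MDf)(1/x)I_k\|_2 \le 2^{k/2}\|MDf\|_\infty \Big( 2\int_{2^{k-1}}^{2^{k+1}} \frac{dx}{x^2} \Big)^{1/2} = \sqrt{3}\,\|MDf\|_\infty.$$

Therefore
$$\|f\|_{\mathcal{H}} \le \sqrt{3}\,\|f\|_{\mathcal{K}}$$
and $\mathcal{K} \subset \mathcal{H}$. We then have the following.


Corollary II.8.7. Let $u \in \mathcal{S}'$ be such that $Fu \in \mathcal{K}$. Then for each $p \in (1,\infty)$,
the map $T : \psi \in \mathcal{S} \to u * \psi$ is of strong type $(p,p)$, with strong $(p,p)$-norm
$\le C_p\|Fu\|_{\mathcal{K}}$.


(The constant $C_p$ here may be taken as $\sqrt{3}$ times the constant $C_p$ in
Theorem II.8.6.)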
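
As an illustration of the class $\mathcal{K}$ (an example of ours, not taken from the text), consider the sign function. The sketch below checks that it belongs to $\mathcal{K}$, assuming the standard distributional facts that $D\,\mathrm{sgn}$ is a constant multiple of $\delta$ (the constant $c_0$ depends on how $D$ is normalized, and is a name introduced here) and that $x\delta = 0$.

```latex
\[
  f(y) := \operatorname{sgn}(y), \qquad \|f\|_\infty = 1, \qquad
  Df = c_0\,\delta ,
\]
\[
  MDf = c_0\,y\,\delta = 0 \in L^\infty
  \quad\Longrightarrow\quad
  f \in \mathcal{K}, \qquad
  \|f\|_{\mathcal K} = \|f\|_\infty + \|MDf\|_\infty = 1 .
\]
% By the inclusion of K in H, it follows that \|f\|_{\mathcal H} \le \sqrt{3}.
```

Under these assumptions, Corollary II.8.7 would apply to any $u \in \mathcal{S}'$ whose Fourier transform is the sign function.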
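
II.9 Some holomorphic semigroups

II.9.1. We shall apply Corollary II.8.7 to the study of some holomorphic
semigroups of operators and their boundary groups.
Let $\mathbb{C}^+ = \{z \in \mathbb{C};\ \Re z > 0\}$. For any $z \in \mathbb{C}^+$ and $\epsilon > 0$, consider the function
$$K_{z,\epsilon}(x) = \Gamma(z)^{-1} e^{-\epsilon x} x^{z-1} \qquad (x > 0) \tag{1}$$
and $K_{z,\epsilon}(x) = 0$ for $x \le 0$.
Clearly, $K_{z,\epsilon} \in L^1 \subset \mathcal{S}'$, and a calculation using residues shows that
$$(FK_{z,\epsilon})(y) = (1/\sqrt{2\pi})\,(\epsilon^2 + y^2)^{-z/2}\,e^{-iz\arctan(y/\epsilon)}. \tag{2}$$

We get easily from (2) that
$$(MDFK_{z,\epsilon})(y) = -\frac{zy}{\epsilon + iy}\,(FK_{z,\epsilon})(y). \tag{3}$$

Hence
$$|MDFK_{z,\epsilon}| \le |z|\,|FK_{z,\epsilon}|. \tag{4}$$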
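
Formula (2) can also be reached through the Gamma integral instead of residues. The following is a sketch only, under conventions that are our assumption (chosen to be consistent with the relations used in this section): $(Fg)(y) = (2\pi)^{-1/2}\int e^{-ixy}g(x)\,dx$, $D = -i\,d/dy$, and the principal branch of the complex power.

```latex
\[
  (FK_{z,\epsilon})(y)
  = \frac{1}{\sqrt{2\pi}\,\Gamma(z)} \int_0^\infty x^{z-1} e^{-(\epsilon+iy)x}\,dx
  = \frac{(\epsilon+iy)^{-z}}{\sqrt{2\pi}}
  = \frac{1}{\sqrt{2\pi}}\,(\epsilon^2+y^2)^{-z/2}\, e^{-iz\arctan(y/\epsilon)} ,
\]
% since \epsilon + iy = (\epsilon^2+y^2)^{1/2} e^{i\arctan(y/\epsilon)} for \epsilon > 0.
\[
  D(\epsilon+iy)^{-z} = -i\,\frac{d}{dy}(\epsilon+iy)^{-z} = -z\,(\epsilon+iy)^{-z-1},
  \qquad
  \Bigl|\frac{y}{\epsilon+iy}\Bigr| \le 1 ,
\]
% which gives the identity (3) after multiplying by y, and then the bound (4).
```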


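Therefore, for all $z = s + it$, $s \in \mathbb{R}^+$, $t \in \mathbb{R}$,
$$\|FK_{z,\epsilon}\|_{\mathcal{K}} \le (1 + |z|)\,\|FK_{z,\epsilon}\|_\infty \le (1/\sqrt{2\pi})\,\epsilon^{-s} e^{\pi|t|/2}(1 + |z|). \tag{5}$$

By Corollary II.8.7, it follows from (5) that the operator
$$T_{z,\epsilon} : f \to K_{z,\epsilon} * f$$
acting on $L^p(\mathbb{R})$ ($1 < p < \infty$) has $B(L^p(\mathbb{R}))$-norm $\le C\epsilon^{-s} e^{\pi|t|/2}(1 + |z|)$, where
$C$ is a constant depending only on $p$. In the special case $p = 2$, the factor
$C(1 + |z|)$ can be omitted from the estimate, since
$$\|T_{z,\epsilon}f\|_2 = \|K_{z,\epsilon} * f\|_2 = \|F(K_{z,\epsilon} * f)\|_2 = \sqrt{2\pi}\,\|(FK_{z,\epsilon})(Ff)\|_2$$
$$\le \sqrt{2\pi}\,\|FK_{z,\epsilon}\|_\infty\|f\|_2 \le \epsilon^{-s} e^{\pi|t|/2}\|f\|_2,$$
by (2) and the fact that $F$ is isometric on $L^2(\mathbb{R})$.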

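Consider $L^p(\mathbb{R}^+)$ as the closed subspace of $L^p(\mathbb{R})$ consisting of all (equivalence
classes of) functions in $L^p(\mathbb{R})$ vanishing a.e. on $(-\infty, 0)$. This is an invariant
subspace for $T_{z,\epsilon}$, and
$$(T_{z,\epsilon}f)(x) = \Gamma(z)^{-1} \int_0^x (x-y)^{z-1} e^{-\epsilon(x-y)} f(y)\,dy \qquad (f \in L^p(\mathbb{R}^+)). \tag{6}$$
We have for all $f \in L^p(\mathbb{R}^+)$
$$\|T_{z,\epsilon}f\|_{L^p(\mathbb{R}^+)} = \|T_{z,\epsilon}f\|_{L^p(\mathbb{R})} \le \|T_{z,\epsilon}\|_{B(L^p(\mathbb{R}))}\|f\|_{L^p(\mathbb{R})}$$
$$= \|T_{z,\epsilon}\|_{B(L^p(\mathbb{R}))}\|f\|_{L^p(\mathbb{R}^+)};$$
hence
$$\|T_{z,\epsilon}\|_{B(L^p(\mathbb{R}^+))} \le \|T_{z,\epsilon}\|_{B(L^p(\mathbb{R}))} \le C\epsilon^{-s} e^{\pi|t|/2}(1 + |z|) \tag{7}$$
for all $z = s + it \in \mathbb{C}^+$. (Again, the factor $C(1 + |z|)$ can be omitted in (7) in the
special case $p = 2$.)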
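For $z, w \in \mathbb{C}^+$ and $y \in \mathbb{R}$, we have by (14) in II.3.2 and (2)
$$[F(K_{z,\epsilon} * K_{w,\epsilon})](y) = \sqrt{2\pi}\,[(FK_{z,\epsilon})(FK_{w,\epsilon})](y)$$
$$= (1/\sqrt{2\pi})\,(\epsilon^2 + y^2)^{-(z+w)/2}\,e^{-i(z+w)\arctan(y/\epsilon)}$$
$$= [FK_{z+w,\epsilon}](y).$$

By the uniqueness property of the $L^1$-Fourier transform (cf. II.3.3), it follows
that
$$K_{z,\epsilon} * K_{w,\epsilon} = K_{z+w,\epsilon}. \tag{8}$$

Therefore, for all $f \in L^p$,
$$(T_{z,\epsilon}T_{w,\epsilon})f = T_{z,\epsilon}(T_{w,\epsilon}f) = K_{z,\epsilon} * (K_{w,\epsilon} * f)$$
$$= (K_{z,\epsilon} * K_{w,\epsilon}) * f = K_{z+w,\epsilon} * f = T_{z+w,\epsilon}f.$$
Thus, $z \to T_{z,\epsilon}$ is a semigroup of operators on $\mathbb{C}^+$, i.e.,
$$T_{z,\epsilon}T_{w,\epsilon} = T_{z+w,\epsilon} \qquad (z, w \in \mathbb{C}^+). \tag{9}$$

For any $N > 0$, consider the space $L^p(0,N)$ (with Lebesgue measure) and the
classical Riemann–Liouville fractional integration operators
$$(J^z f)(x) = \Gamma(z)^{-1} \int_0^x (x-y)^{z-1} f(y)\,dy \qquad (f \in L^p(0,N)).$$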




It is known that $z \to J^z \in B(L^p(0,N))$ is strongly continuous in $\mathbb{C}^+$. Since
$$\|(T_{z,\epsilon} - T_{w,\epsilon})f\|_{L^p(0,N)} \le \|(J^z - J^w)(e^{\epsilon x}f)\|_{L^p(0,N)}, \tag{10}$$
the function $z \to T_{z,\epsilon} \in B(L^p(0,N))$ is strongly continuous as well. For the space
$L^p(\mathbb{R}^+)$, we have by (10)


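$$\|T_{z,\epsilon}f - T_{w,\epsilon}f\|^p_{L^p(\mathbb{R}^+)} \le \|(J^z - J^w)(e^{\epsilon x}f)\|^p_{L^p(0,N)}$$
$$+ \int_N^\infty e^{-p\epsilon x/2}\,|T_{z,\epsilon/2}(e^{\epsilon x/2}f)|^p\,dx$$
$$+ \int_N^\infty e^{-p\epsilon x/2}\,|T_{w,\epsilon/2}(e^{\epsilon x/2}f)|^p\,dx.$$

For $f \in C_c(\mathbb{R}^+)$, $e^{\epsilon x/2}f \in L^p(\mathbb{R}^+)$, and therefore, for all $z = s + it$ and $w = u + iv$
in $\mathbb{C}^+$, it follows from (7) that the sum of the integrals over $(N,\infty)$ can be
estimated by
$$e^{-p\epsilon N/2}\,C^p \big[ (\epsilon/2)^{-ps} e^{p\pi|t|/2}(1 + |z|)^p + (\epsilon/2)^{-pu} e^{p\pi|v|/2}(1 + |w|)^p \big]\,\|e^{\epsilon x/2}f\|^p_{L^p(\mathbb{R}^+)}.$$
Fix $z \in \mathbb{C}^+$, and let $M$ be a bound for the expression in square brackets when $w$
belongs to some closed disc $\bar B(z,r) \subset \mathbb{C}^+$. Given $\delta > 0$, we may choose $N$ such
that
$$e^{-p\epsilon N/2}\,C^p M\,\|e^{\epsilon x/2}f\|^p_{L^p(\mathbb{R}^+)} < \delta^p.$$

For this $N$, it then follows from the strong continuity of $J^z$ on $L^p(0,N)$ that
$$\limsup_{w \to z} \|(T_{z,\epsilon} - T_{w,\epsilon})f\|_{L^p(\mathbb{R}^+)} \le \delta.$$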


Wz 
Thus, $T_{w,\epsilon}f \to T_{z,\epsilon}f$ as $w \to z$ for all $f \in C_c(\mathbb{R}^+)$. Since $\|T_{w,\epsilon}\|_{B(L^p(\mathbb{R}^+))}$ is
bounded for $w$ in compact subsets of $\mathbb{C}^+$ (by (7)) and $C_c(\mathbb{R}^+)$ is dense in $L^p(\mathbb{R}^+)$,
it follows that $T_{w,\epsilon} \to T_{z,\epsilon}$ in the strong operator topology (as $w \to z$). Thus
the function $z \to T_{z,\epsilon}$ is strongly continuous in $\mathbb{C}^+$. A similar argument (based
on the corresponding known property of $J^z$) shows that $T_{z,\epsilon} \to I$ strongly, as
$z \in \mathbb{C}^+ \to 0$. (Another way to prove this is to rely on the strong continuity of $T_{z,\epsilon}$
at $z = 1$, which was proved earlier; by the semigroup property, it follows that
$T_{z,\epsilon}f \to f$ in $L^p(\mathbb{R}^+)$-norm for all $f$ in the range of $T_{1,\epsilon}$, which is easily seen to
be dense in $L^p(\mathbb{R}^+)$. Since the $B(L^p(\mathbb{R}^+))$-norms of $T_{z,\epsilon}$ are uniformly bounded
in the rectangle $Q := \{z = s + it;\ 0 < s \le 1,\ |t| \le 1\}$, the result follows.) An
application of Morera's theorem shows now that $T_{z,\epsilon}$ is an analytic function of
$z$ in $\mathbb{C}^+$. ($T_{\cdot,\epsilon}$ is said to be a holomorphic semigroup on $\mathbb{C}^+$.)
Fix $t \in \mathbb{R}$ and $\delta > 0$. For $z \in B^+(it,\delta) := \{z \in \mathbb{C}^+;\ |z - it| < \delta\}$, we have
by (7)
$$\|T_{z,\epsilon}\| \le C \max(\epsilon^{-\delta}, 1)\,e^{(\pi/2)(|t| + \delta)}(1 + |t| + \delta). \tag{11}$$




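If $z_n \in B^+(it,\delta)$ converge to $it$ and $f = T_{1,\epsilon}g$ for some $g \in L^p(\mathbb{R}^+)$, then
$$T_{z_n,\epsilon}f = T_{z_n+1,\epsilon}\,g \to T_{it+1,\epsilon}\,g$$
in $L^p(\mathbb{R}^+)$, by strong continuity of $T_{z,\epsilon}$ at the point $it + 1$. Thus, $\{T_{z_n,\epsilon}f\}$ is
Cauchy in $L^p(\mathbb{R}^+)$ for each $f$ in the dense range of $T_{1,\epsilon}$, and therefore, by (11),
it is Cauchy for all $f \in L^p(\mathbb{R}^+)$. If $z'_n \in B^+(it,\delta)$ also converge to $it$, then
$T_{z_n,\epsilon}f - T_{z'_n,\epsilon}f \to 0$ in $L^p(\mathbb{R}^+)$ for all $f$ in the range of $T_{1,\epsilon}$ (by strong continuity
of $T_{z,\epsilon}$ at $z = 1 + it$), hence, for all $f \in L^p(\mathbb{R}^+)$, by (11) and the density of
the said range. Therefore the $L^p$-limit of the Cauchy sequence $\{T_{z_n,\epsilon}f\}$ (for
each $f \in L^p(\mathbb{R}^+)$) exists and is independent of the particular sequence $\{z_n\}$ in
$B^+(it,\delta)$. This limit (denoted as usual $\lim_{z \to it} T_{z,\epsilon}f$) defines a linear operator
denoted by $T_{it,\epsilon}$. By (11)
$$\|T_{it,\epsilon}f\|_p \le C \max(\epsilon^{-\delta}, 1)\,e^{(\pi/2)(|t| + \delta)}(1 + |t| + \delta)\,\|f\|_p,$$
where the norms are $L^p(\mathbb{R}^+)$-norms. Since $\delta > 0$ is arbitrary, we conclude that
$$\|T_{it,\epsilon}\| \le C e^{\pi|t|/2}(1 + |t|) \qquad (t \in \mathbb{R}), \tag{12}$$
where the norm is the $B(L^p(\mathbb{R}^+))$-norm and $C$ depends only on $p$. (The factor
$C(1 + |t|)$ can be omitted in case $p = 2$.)
Since $T_{w,\epsilon}$ is a bounded operator on $L^p(\mathbb{R}^+)$ (cf. (7)), we have for each
$w \in \mathbb{C}^+$, $t \in \mathbb{R}$, and $f \in L^p(\mathbb{R}^+)$,
$$T_{w,\epsilon}T_{it,\epsilon}f = \lim_{z \to it} T_{w,\epsilon}T_{z,\epsilon}f = \lim_{z \to it} T_{w+z,\epsilon}f = T_{w+it,\epsilon}f.$$
Also, by definition,
$$T_{it,\epsilon}T_{w,\epsilon}f = \lim_{z \to it} T_{z,\epsilon}T_{w,\epsilon}f = \lim_{z \to it} T_{z+w,\epsilon}f = T_{w+it,\epsilon}f.$$

Thus
$$T_{w,\epsilon}T_{it,\epsilon} = T_{it,\epsilon}T_{w,\epsilon} = T_{w+it,\epsilon} \tag{13}$$
for all $w \in \mathbb{C}^+$ and $t \in \mathbb{R}$. (In particular, for all $s \in \mathbb{R}^+$ and $t \in \mathbb{R}$,
$T_{s+it,\epsilon} = T_{s,\epsilon}T_{it,\epsilon} = T_{it,\epsilon}T_{s,\epsilon}$.)


Letting $w \to is$ in (13) (for any $s \in \mathbb{R}$), it follows from the definition of the
operators $T_{it,\epsilon}$ and their boundedness over $L^p(\mathbb{R}^+)$ that
$$T_{is,\epsilon}T_{it,\epsilon}f = T_{i(s+t),\epsilon}f \qquad (s, t \in \mathbb{R};\ f \in L^p(\mathbb{R}^+)). \tag{14}$$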


Thus, $\{T_{it,\epsilon};\ t \in \mathbb{R}\}$ is a group of operators, called the boundary group of
the holomorphic semigroup $\{T_{z,\epsilon};\ z \in \mathbb{C}^+\}$. The boundary group is strongly
continuous. (We use the preceding argument: for $f = T_{1,\epsilon}g$ for some $g \in L^p$,
$$T_{is,\epsilon}f = T_{1+is,\epsilon}\,g \to T_{1+it,\epsilon}\,g = T_{it,\epsilon}f$$
as $s \to t$, by strong continuity of $T_{z,\epsilon}$ at $z = 1 + it$. By (12), the same is true for
all $f \in L^p$, since $T_{1,\epsilon}$ has dense range in $L^p$.) We formalize the previous results
as follows.


Theorem II.9.2. For each $\epsilon > 0$ and $p \in (1,\infty)$, the family of operators
$\{T_{z,\epsilon};\ z \in \mathbb{C}^+\}$ defined by (6) has the following properties:

(1) It is a holomorphic semigroup of operators in $L^p(\mathbb{R}^+)$;

(2) $\lim_{z \in \mathbb{C}^+ \to 0} T_{z,\epsilon} = I$ in the s.o.t.;

(3) $\|T_{z,\epsilon}\| \le C(1 + |z|)\,\epsilon^{-s} e^{\pi|t|/2}$ for all $z = s + it \in \mathbb{C}^+$, where the norm is
the operator norm on $L^p(\mathbb{R}^+)$ and $C$ is a constant depending only on $p$
(in case $p = 2$, the factor $C(1 + |z|)$ can be omitted).

(4) The boundary group
$$T_{it,\epsilon} = (\text{strong}) \lim_{z \in \mathbb{C}^+ \to it} T_{z,\epsilon} \qquad (t \in \mathbb{R})$$
exists, and is a strongly continuous group of operators in $L^p(\mathbb{R}^+)$ with
Properties (12) and (13).


We may apply the theorem to the classical Riemann–Liouville semigroup $J^z$
on $L^p(0,N)$ ($N > 0$). Elements of $L^p(0,N)$ are regarded as elements of $L^p(\mathbb{R}^+)$
vanishing outside the interval $(0,N)$. All $p$-norms below are $L^p(0,N)$-norms. The
inequality (7) takes the form
$$\|T_{z,\epsilon}\|_{B(L^p(0,N))} \le C(1 + |z|)\,\epsilon^{-s} e^{\pi|t|/2} \qquad (z = s + it \in \mathbb{C}^+). \tag{15}$$
For all $f \in L^p(0,N)$,
$$\|J^z f\|_p \le \|K_z\|_1\|f\|_p,$$
where $K_z(x) = \Gamma(z)^{-1}x^{z-1}$ for $x \in (0,N)$. Calculating the $L^1$-norm above we
then have
$$\|J^z\| \le \frac{N^s}{s\,|\Gamma(z)|} \qquad (z = s + it \in \mathbb{C}^+). \tag{16}$$
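
The $L^1$-norm in question is elementary. As a sketch, for $z = s + it$ with $s > 0$:

```latex
\[
  \|K_z\|_1 = \frac{1}{|\Gamma(z)|}\int_0^N |x^{z-1}|\,dx
  = \frac{1}{|\Gamma(z)|}\int_0^N x^{s-1}\,dx
  = \frac{N^s}{s\,|\Gamma(z)|} ,
\]
% because |x^{z-1}| = x^{s-1} for x > 0; together with the bound
% \|J^z f\|_p \le \|K_z\|_1 \|f\|_p this yields (16).
```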


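Since $0 \le 1 - e^{-\epsilon(x-y)} \le \epsilon(x-y)$ for $0 \le y \le x \le N$, we get for $z = s + it \in \mathbb{C}^+$
$$|J^z f - T_{z,\epsilon}f| \le \epsilon\,\frac{\Gamma(s+1)}{|\Gamma(z)|}\,J^{s+1}|f|,$$
and therefore by (16)
$$\|J^z f - T_{z,\epsilon}f\|_p \le \epsilon\,\frac{N^{s+1}}{(s+1)\,|\Gamma(z)|}\,\|f\|_p = \epsilon\,\frac{|z|\,N^{s+1}}{(s+1)\,|\Gamma(z+1)|}\,\|f\|_p. \tag{17}$$

By (15) and (17), we have for all $z \in Q := \{z = s + it \in \mathbb{C}^+;\ s \le 1,\ |t| \le 1\}$
$$\|J^z f\|_p \le \|J^z f - T_{z,1}f\|_p + \|T_{z,1}f\|_p \le M\|f\|_p, \tag{18}$$
where
$$M = \sup_{0 < s \le 1,\ |t| \le 1} \frac{N^{s+1}\sqrt{2}}{|\Gamma(s + 1 + it)|} + C(1 + \sqrt{2})\,e^{\pi/2}.$$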


Given $t \in \mathbb{R}$, let $n = [|t|] + 1$. Then $z = s + it \in nQ$ for $s \le 1$, and therefore, by
the semigroup property (with $w = z/n \in Q$),


J7I] = Ir I] = F224) < oe" < a = Mlle, 


This inequality is surely valid in a neighborhood $B^+(it,\delta)$, and the argument
of Section II.8 implies the existence of the boundary group $J^{it}$, defined as the
strong limit of $J^z$ as $z \to it$. By (17) and (15), we also have (for $z = s + it \in \mathbb{C}^+$)
$$\|J^z f\|_p \le \epsilon\,\frac{|z|\,N^{s+1}}{(s+1)\,|\Gamma(z+1)|}\,\|f\|_p + C(1 + |z|)\,\epsilon^{-s} e^{\pi|t|/2}\,\|f\|_p.$$

Letting $s \to 0$, we get for all $f \in L^p(0,N)$
$$\|J^{it}f\|_p \le \epsilon\,\frac{|t|\,N}{|\Gamma(it+1)|}\,\|f\|_p + C(1 + |t|)\,e^{\pi|t|/2}\,\|f\|_p.$$

Since $\epsilon$ is arbitrary, we conclude that
$$\|J^{it}\| \le C(1 + |t|)\,e^{\pi|t|/2} \qquad (t \in \mathbb{R}). \tag{19}$$
As before, the factor $C(1 + |t|)$ can be omitted in case $p = 2$. Formally:


Corollary II.9.3. For each $p \in (1,\infty)$ and $N > 0$, the Riemann–Liouville
semigroup $J^z$ on $L^p(0,N)$ has a boundary group $J^{it}$ (defined as the strong limit
of $J^z$ as $z \in \mathbb{C}^+ \to it$). The boundary group is strongly continuous, and
it satisfies the identity $J^{it}J^w = J^w J^{it} = J^{w+it}$ (for all $t \in \mathbb{R}$ and $w \in \mathbb{C}^+$) and the growth
relation (19).


Bibliography 


Akhiezer, N. I. and Glazman, I. M. (1962, 1963). Theory of Linear Operators 
in Hilbert Space, Vols I, II, Ungar, New York. 


Ash, R. B. (1972). Real Analysis and Probability, Academic Press, New York. 


Bachman, G. and Narici, L. (1966). Functional Analysis, Academic Press, 
New York. 

Banach, S. (1932). Théorie des Opérations Linéaires, Hafner, New York.
Berberian, S. K. (1961). Introduction to Hilbert Space, Oxford University 
Press, Fair Lawn, NJ. 

Berberian, S. K. (1965). Measure and Integration, Macmillan, New York. 
Berberian, S. K. (1974). Lectures in Functional Analysis and Operator 
Theory, Springer-Verlag, New York. 

Blackadar, B. (2006). Operator Algebras, Theory of C*-Algebras and von 
Neumann Algebras, Springer-Verlag, Berlin. 


Bonsall, F. F. and Duncan, J. (1973). Complete Normed Algebras, Springer- 
Verlag, Berlin. 

Bourbaki, N. (1953, 1955). Espaces Vectoriels Topologiques, livre V, 
Hermann, Paris. 

Browder, A. (1968). Introduction to Function Algebras, Benjamin, New York. 


Brown, N. P. and Ozawa, N. (2008). C*-Algebras and Finite-Dimensional 
Approximations, Providence, Rhode Island. 


Brown, A. and Pearcy, C. (1977). Introduction to Operator Theory, Springer- 
Verlag, New York. 


Colojoara, I. and Foias, C. (1968). Theory of Generalized Spectral Operators, 
Gordon and Breach, New York. 


Davidson, K. R. (1996). C*-Algebras by Example, Providence, RI. 

Davies, E. B. (1980). One-Parameter Semigroups, Academic Press, London. 
Dieudonné, J. (1960). Foundations of Modern Analysis, Academic Press,
New York. 

Dixmier, J. (1964). Les C*-algèbres et leurs Représentations, Gauthier-
Villars, Paris.

Dixmier, J. (1969). Les Algèbres d'Opérateurs dans l'Espace Hilbertien,
2nd ed., Gauthier-Villars, Paris.




Doob, J. L. (1953). Stochastic Processes, Wiley, New York. 

Douglas, R. G. (1972). Banach Algebra Techniques in Operator Theory, 
Academic Press, New York. 

Dowson, H. R. (1978). Spectral Theory of Linear Operators, Academic Press, 
London. 

Dunford, N. and Schwartz, J. T. (1958, 1963, 1972). Linear Operators,
Parts I, II, III, Interscience, New York.

Edwards, R. E. (1965). Functional Analysis, Holt, Rinehart, and Winston, 
Inc., New York. 

Friedman, A. (1963). Generalized Functions and Partial Differential 
Equations, Prentice-Hall, Englewood Cliffs, NJ. 

Friedman, A. (1970). Foundations of Modern Analysis, Holt, Rinehart, and 
Winston, New York. 

Gamelin, T. W. (1969). Uniform Algebras, Prentice-Hall, Englewood Cliffs, 
NJ. 

Goffman, C. and Pedrick, G. (1965). First Course in Functional Analysis, 
Prentice-Hall, Englewood Cliffs, NJ. 

Goldberg, S. (1966). Unbounded Linear Operators, McGraw-Hill, New York. 
Goldstein, J. A. (1985). Semigroups of Operators and Applications, Oxford, 
New York. 

Halmos, P. R. (1950). Measure Theory, Van Nostrand, New York. 

Halmos, P. R. (1951). Introduction to Hilbert Space and the Theory of Spectral 
Multiplicity, Chelsea, New York. 

Halmos, P. R. (1967). A Hilbert Space Problem Book, Van Nostrand- 
Reinhold, Princeton, NJ. 

Hewitt, E. and Ross, K. A. (1963, 1970). Abstract Harmonic Analysis, Vols.
I, II, Springer-Verlag, Berlin.

Hewitt, E. and Stromberg, K. (1965). Real and Abstract Analysis, Springer- 
Verlag, New York. 

Hille, E. and Phillips, R. S. (1957). Functional Analysis and Semigroups, 
A.M.S. Colloq. Publ. 31, Providence, RI. 

Hirsch, F. and Lacombe, G. (1999). Elements of Functional Analysis, 
Springer, New York. 

Hoffman, K. (1962). Banach Spaces of Analytic Functions, Prentice-Hall, 
Englewood Cliffs, NJ. 

Hörmander, L. (1960). Estimates for translation invariant operators, Acta
Math., 104, 93-140.

Hörmander, L. (1963). Linear Partial Differential Equations, Springer-
Verlag, Berlin.


Kadison, R. V. and Ringrose, J. R. (1997). Fundamentals of the Theory 
of Operator Algebras, Vols. I, II, II, A.M.S. Grad. Studies in Math., 
Providence, RI. 




Kantorovitz, S. (1983). Spectral Theory of Banach Space Operators, Lecture 
Notes in Math., Vol. 1012, Springer, Berlin. 


Kantorovitz, S. (1995). Semigroups of Operators and Spectral Theory, 
Pitman Research Notes in Math., Vol. 330, Longman, New York. 


Kantorovitz, S. (2010). Topics in Operator Semigroups, Birkhäuser, Boston,
MA.


Kato, T. (1966). Perturbation Theory for Linear Operators, Springer-Verlag, 
New York. 


Katznelson, Y. (1968). An Introduction to Harmonic Analysis, Wiley, 
New York. 


Köthe, G. (1969). Topological Vector Spaces, Springer-Verlag, New York.

Lang, S. (1969). Analysis II, Addison-Wesley, Reading, MA.

Larsen, R. (1973). Banach Algebras, Marcel Dekker, New York. 

Larsen, R. (1973). Functional Analysis, Marcel Dekker, New York. 

Loève, M. (1963). Probability Theory, Van Nostrand-Reinhold, Princeton,
NJ.


Loomis, L. H. (1953). An Introduction to Abstract Harmonic Analysis, Van 
Nostrand, New York. 


Malliavin, P. (1995). Integration and Probability, Springer-Verlag, New York. 


Maurin, K. (1967). Methods of Hilbert Spaces, Polish Scientific Publishers, 
Warsaw. 


Megginson, R. E. (1998). An Introduction to Banach Space Theory, Springer, 
New York. 

Munroe, M. E. (1971). Introduction to Measure and Integration, 2nd ed., 
Addison-Wesley, Reading, MA. 

Murphy, G. J. (1990). C*-Algebras and Operator Theory, Academic Press, 
Boston, MA. 

Naimark, M. A. (1959). Normed Rings, Noordhoff, Groningen. 


Palmer, T. W. (1994, 2001). Banach Algebras and the General Theory of
*-Algebras, Vols I, II, Cambridge University Press, Cambridge.


Pazy, A. (1983). Semigroups of Linear Operators and Applications to Partial 
Differential Equations, Springer, New York. 

Pedersen, G. K. (2018). C*-Algebras and their Automorphism Groups, 2nd 
ed., Academic Press, London. 

Reed, M. and Simon, B. (1975). Methods of Modern Mathematical Physics,
Vols I, II, Academic Press, New York.

Rickart, C. E. (1960). General Theory of Banach Algebras, Van Nostrand- 
Reinhold, Princeton, NJ. 

Riesz, F. and Sz-Nagy, B. (1955). Functional Analysis, Ungar, New York. 
Royden, H. L. (1968). Real Analysis, 2nd ed., Macmillan, New York. 
Rudin, W. (1962). Fourier Analysis on Groups, Interscience-Wiley, 
New York. 




Rudin, W. (1973). Functional Analysis, McGraw-Hill, New York. 


Rudin, W. (1974). Real and Complex Analysis, 2nd ed., McGraw-Hill, 
New York. 


Sakai, S. (1971). C*-Algebras and W*-Algebras, Springer-Verlag, New York. 


Schmüdgen, K. (2012). Unbounded Self-Adjoint Operators on Hilbert Space,
Springer, Dordrecht.


Schwartz, L. (1951). Théorie des Distributions, Vols I, II, Hermann, Paris.


Stone, M. (1932). Linear Transformations in Hilbert Space, A.M.S. Colloq. 
Publ. 15, Providence, RI. 


Stout, E. L. (1971). The Theory of Uniform Algebras, Bogden and Quigley, 
New York. 


Strătilă, Ş. (1981). Modular Theory in Operator Algebras, Abacus Press, 
Tunbridge Wells. 

Takesaki, M. (2002, 2003, 2003). Theory of Operator Algebras, Vols I, II, III, 
Springer-Verlag, Berlin. 

Taylor, A. E. (1958). Introduction to Functional Analysis, Wiley, New York. 


Tucker, H. G. (1967). A Graduate Course in Probability, Academic Press, 
New York. 


Weil, A. (1951). L’intégration dans les Groupes Topologiques et ses 
Applications, 2nd ed., Act. Sci. et Ind. 869, 1145, Hermann et Cie, Paris. 


Wheeden, R. L. and Zygmund, A. (1977). Measure and Integral, Marcel 
Dekker, New York. 


Wilansky, A. (1964). Functional Analysis, Blaisdell, New York. 
Yosida, K. (1966). Functional Analysis, Springer-Verlag, Berlin. 
Zaanen, A. C. (1953). Linear Analysis, North-Holland, Amsterdam. 


Zygmund, A. (1959). Trigonometric Series, 2nd ed., Cambridge University 
Press, Cambridge. 


Index 


abelian groups 126 
abstract Cauchy problem (ACP) 312 
abstract harmonic analysis 107 
abstract measure theory 2 
abstract potential 313 
adjoint equation P(-D)e = 0 523 
adjoint operator of T 293, 357 
adjoints, operators 165, 262, 289, 
296-9, 357 
Alaoglu’s theorem 130, 140, 346, 367 
Alexandroff one-point 
compactification 82 
algebras 
commutative 170 
generated by semi-algebra 59-61 
homomorphism 174, 191, 195, 202, 
223, 239, 241, 270 
representation 274 
polynomials 199 
see also Banach algebras; Borel algebra; C*-algebras; 
von Neumann algebras; W*-algebras 
analytic functions 286 
analytic operational calculus 274-7 
anti-automorphism 171, 195 
anti-multiplicativity 181, 207 
antisymmetric set 152 
approximate identities 127, 323, 
332-4, 398 
Banach algebras 332-4 
approximation, almost everywhere by 
continuous functions 105 
approximation theorem 7-8, 14, 26 
arc-length measure 232 
Arens products 194, 212-15 
Arzelà-Ascoli theorem 130, 161-2, 
283 


automatic regularity 101 
averages lemma 35-6, 46, 105 
axiom of choice 69 


B*-algebra 
defined 324 
subalgebra 201-2 
Baire’s category theorem 173, 174, 
177 
Banach algebras 193-202 
approximate identities 332-4 
Arens products on A** 194, 212-15 
bounded complex Borel functions 
on C or R 255 
B(X), all bounded linear operators 
on a Hilbert space X 208 
commutative 203-7 
complex functions 194 
continuous derivatives 273 
continuous functions on K 254-5 
Gelfand representation 193, 208 
involutions and C*-algebras 207-11 
logarithms 284 
nilpotent 201 
normed 194 
quasi-nilpotent 201 
self-adjoint 207 
semi-simple 207 
unital 195, 196, 199, 209 
Banach subalgebra 201-2 
Banach space 28, 30, 130, 169, 173 
adjoint of T 135 


B(X) 194 
Co(X) 194 
closed subspace 181 
closed unit ball S 145 
dual, defined 365 
identity map 179 
isomorphic 135 
linear operators 290-3 
maximal Banach subspace Z of 
X 266 
norm induced by inner product 225 
projection on 228 
reflexive 270 
closed subspace 134 
if its conjugate is reflexive 135, 
145 
restriction map 134 
subspace, spectral measure 254-5 
subspace of X 301 
X, linear subspaces of X 290 
Banach subalgebra 201-2 
Beppo Levi theorem 14, 71, 465 
Bernoulli 
case 416 
law of large numbers 418-19 
random variables 416, 428, 440-1 
Bernstein’s inequality 514, 541 
Bessel 
identity 233 
inequality 227, 229, 232 
Beurling-Gelfand spectral radius 
formula 200 
Bienaymé’s identity 415, 441, 451, 
452, 477 
bijective map 159, 207, 235, 236, 262, 
291, 321, 330, 383 
continuous open map 179, 211 
isometric linear map 235 
binomial coefficients 411 
binormal density 473 
binormal distribution 473-5 
Bishop’s antisymmetry theorem 130, 
145-50 
Bolzano-Weierstrass theorem 436 
Borel algebra 69, 70-1, 116 




Borel functions 298, 444, 460, 466-70 
bounded 148, 256, 261, 299 
non-negative 425 

Borel map 3 

Borel measure 81, 98-100, 116, 148 
complex 107, 114, 115, 255, 323-4 
finite positive 264 
integration 255 
positive regular 261 

Borel sets 2, 98, 117, 412, 442 
subsets 412 

Borel strong law of large numbers 420 

Borel-Cantelli lemma 422, 423 

boundary groups 542, 545-7 

bounded linear functionals 129 
absolute value 350 

bounded linear operators 107 
on Hilbert space X 208 

bounded operator theory 173-92 
operator topologies 182-4 
operators on X 109 
selfadjoint operator, resolution of 
identity 298 
weak operator topology 183 

bounded weak*-topology 130, 165-9 

boundedness condition 50 

bounding point 141 


C*-algebras 193, 207-11, 323-54 
*-homomorphisms 323 
approximate identities 332-4 
Banach algebra 193 
closed ideals 333-5 
concrete 323 
constructions 383-408 
element of the form x*x 323 
enveloping von Neumann algebra 
A** 356 

C*-algebra 376-9 
group C*-algebras 397-403, 406-7 
involution 193 
irreducible 345 
non-commutative topology 356 
normal elements 211-12 
notation and examples 324-5 




positive elements 327-32 

positive linear functionals 335-40 

positivity, of elements and 
functionals 323, 327 

quasi-nilpotent element 210 

selfadjoint element 210 

tensor products 383-97 

universal representation 345 

see also von Neumann algebras 

calculus 


analytic operational calculus 274-7 


C(K)-operational calculus 257 
continuous operational 
calculus 325-7 
fundamental theorem 104-5 
H(K)-operational calculus 276 


polynomial operational calculus 267 


unbounded selfadjoint 
operators 289, 298-9 
Calderón-Zygmund decomposition 
lemma 532 
canonical embedding, of X onto 
X** 134 
canonical isometric isomorphism 365 
Cantor functions 103 
Carathéodory 


constructing positive measures from 
primitive objects 59 
extension theorem 64 
measurability condition 63 

cardinality, canonical model 237 
Cartesian product X × Y, two 
normed spaces 179 
Case Z = X 257 
category theorem 173, 174, 177 
Cauchy density 431 
abstract Cauchy problem 
(ACP) 312 
Cauchy integral formula 274, 277 
applied to function fq 521 
vector-valued version 274 
Cauchy integral theorem 429, 431, 
512 
Cauchy sequences 1, 29, 48-9, 180, 
545 
[x,], convergent subsequence 181 
in X/M 181 




Cauchy-Schwarz inequality 231, 236, 
310, 341, 540 
Cayley transform 286, 
300, 301 
central limit theorems 
for equidistributed random 
variables 441 
for uniformly bounded random 
variables 440 
see also Laplace 
Cesàro mean 242 
chi-squared distribution 432 
claim solution 529-31, 539-40 
classical Volterra operator 219 
closed graph theorem 173, 180, 221, 
290, 291, 294 
closed ideals, C*-algebras 
333-5 
closed operators 180, 185, 290, 291, 
303-5, 312 
commutation with translations 502 
commutative semigroup 268 
commutative von Neumann 
algebras 375-6 
commutativity 320 
compact convex set 137 
compact Hausdorff space 130, 137, 
148, 193 
compact metric space 280, 283 
compact neighborhood 82 
compact operators 279-82 
normal operators 284 
Riesz-Schauder theorem 253 
complementary event 409 
complementary projection 228 
complete normed space 109 
complete orthonormal bases/sets 232 
complete reflexive space 134 
completeness hypothesis 178 
complex matrices, Jordan canonical 
form 260 
complex measures 508 
natural operations 43 
complex vector space 17, 31, 43, 96, 
136, 141, 250, 307, 308 
complex-valued continuous 
functions 81, 160, 195 




composite function theorem 276 
composition theorems 212 
conditional distribution 
continuous case 472-3 
discrete case 471-2 
conditional probability 462 
conditioning, by a random 
variable 467-71 
confidence intervals 452 
conjugate space X* 107 
for arbitrary normed space X 129 
normed space 109 
conjugate spaces 
of any normed space, complete 109 
conjugate of C.(X) 110-14 
Lebesgue spaces 107 
conjugation operator 136 
consistency, definition 453 
construction of measures 59-64 
continuity of the inverse 179 
continuous functions, 
complex-valued 81, 160, 195 
continuous linear functionals 226 
on Hilbert spaces 107-28 
continuous linear operators 508 
continuous operational calculus 325-7 
continuous spectrum 292 
contractive class C’ 268 
contractive C(R)-operational calculus 
for T on Z 300 
contractive spectral measure on Z 302 
convergence 47-50 
on finite measure space 50-1 
Lebesgue’s theorems 20, 71, 74, 
125, 262, 264, 298, 519 
of semigroups 316 
submartingale convergence 
theorem 489 
convex combination, vectors 137 
convex hull 138, 157 
convolution 
of distributions 502 
and Fourier transform 77 
of functions 503 
with u commutes with 
translations 502 
of u and φ 500 




convolution operators 531-42 
core 

for a closed operator 291 

for generator of semigroup 314 
covariance 414 
critical value 459 
cyclic vector 342, 375 

spatial isomorphism 376 


decompositions of functionals 
349-50 
deficiency indices of T 304 
deficiency spaces of T 305 
delta measure 496, 503 
De Morgan law 2 
dual laws 409, 420 
dense domain D(T), in Hilbert 
space 293 
density 
Cauchy density 431 
distributions with 429, 449 
exponential distribution 
density 431, 432 
F-, Snedecor density 449, 450 
Gamma density 431-2 
Kaplansky’s density theorem 355, 
362, 377 
Laplace density 430 
normal 429 
Student, with v degrees of 
freedom 447-8, 456, 461 
von Neumann algebras 362-3 
diagonal matrix 253 
Dieudonné’s theorem 166-7 
differentiability, of measures 
97-101 
differential operator P(D), 
fundamental solutions 520-3 
Dirichlet integral 433, 507 
disjoint unions 61 
distance theorem, Hilbert space 33 
distributions 491-547 
binormal 473-5 
convolution operators 531-42 
with density 429, 449 
functions 51-3, 425-6, 431 




fundamental solutions 520-3 
holomorphic semigroups 542-7 
multiplication of a distribution by a 
function 498 
preliminaries 491-3 
regularity of solutions 525-8 
spaces and dual spaces 499 
temperate distributions 503-19 
variable coefficients 528-31 
divergence, Fourier series 245 
dominated convergence theorem 
for conditional expectation 466 
of Lebesgue 20, 71, 74, 125, 262, 
298, 519 
double dual 356 
double integral 59, 72 
dual space 499, 507 
duality 129-72 


Egoroff’s theorem 1, 50, 57 
Ehrenpreis-Malgrange-Hörmander 
theorem 520 
eigenspace 262 
eigenvalues 262 
eigenvectors 263 
embedding, reflexive 129 
empty event 409, 411 
enveloping von Neumann algebra 356, 
376-9 
equicardinality, Hilbert dimension 234 
equivalence law of ordered 
sampling 411, 417 
Erdős-Rényi theorem 423, 424 
estimator of θ 451 
Euclidean space Rk 60 
events 
complementary 409 
empty, sure, equivalent 409, 411 
pairwise independent events 424 
sample space 410 
sequence 419 
exponential distribution density 431, 
432 
exponential formulas 317 




exponential type functions 513 
extremal points, Krein-Milman 
theorem 146-7 


F-density (Snedecor density) 449, 450 
Fatou’s lemma 138, 20, 28-9, 236, 298, 
486-7 
finite measure space 50-1 
finite positive, Borel measure 264 
Fisher’s theorem 444-50 
fixed points 157-65 
Markov-Kakutani theorem 130, 
157-9 
Fourier 
coefficients 243 
of L? functions 246 
expansion 232 
inversion formula 508, 512 
series, divergence 245 
transform 407 
Parseval’s formula 506 
Schwartz space 504, 506 
temperate distribution 508 
Fourier-Laplace transforms of 
distributions with compact 
support 511 
Fourier-Stieltjes transform 508 
fractional integration operators 543 
Friedrichs’ extension 321 
Fubini’s theorem 75-6, 124, 164, 275, 
433, 437, 441, 442, 472, 475, 
505 
fundamental theorem of 
calculus 104-5 


Gamelin 268 

Gamma density 431-2 

Gamma-distributed random 
variables 432 

Gaussian density 429 

Gelfand representation 193-4, 205, 
206, 208, 216 

Gelfand space 193, 204-5, 209 

Gelfand topology 204, 207 

Gelfand transform of a 286 




Gelfand transform of x 204-5 
Gelfand-Mazur theorem 199, 205 
Gelfand-Naimark theorem 193-4, 
208, 210, 211, 323-4 
commutative 194, 210, 332, 338, 
375 
non-commutative 323-4, 338, 340, 
345 
Gelfand-Naimark-Segal (GNS) 
construction/theorem 324, 
340-6 
general Leibnitz formula 498 
generator of semigroup 311 
core 314 
GNS representation 324, 340-6 
Goldstine’s theorem 130, 144 
graph norm on D(T) 291 
graphs 179-81 
closed graph theorem 180 
groups of operators 318 


Haar measure 107, 118-26 
for compact topological groups 130 
existence and uniqueness 163-5 
Hahn decomposition 47 
Hahn’s fixed point theorem 130, 
159-60 
Hahn-Banach lemma 128-30, 137, 
139-40, 173 
separation of convex sets in vector 
spaces 137 
Hahn-Banach theorem 129-34, 150, 
160, 168, 184, 339, 340, 393, 
497, 521 
Hamel base 238 
Hardy inequality 102 
Hartogs-Rosenthal’s theorem 283 
Hausdorff 
compact/locally compact 
space 130, 137, 148, 193 
space 81, 82, 114, 137, 193, 323-4 
topology 129 
Heine-Borel theorem 145 
Helly-Bray lemma 434-6 
Hermitian elements 307, 349, 350 




Hessian 454 
Hilbert 
adjoint 293-6 
of densely-defined unbounded 
operators 289 
operation 208, 357 
automorphism of L² 507 
dimension 234-5 
Hilbert spaces 225-52 
bounded linear operators 208 
C*-algebras 324 
canonical model 237 
continuous linear 
functionals 107-28 
deficiency indices of T 304 
direct sums 236-7 
equicardinality 234 
geometric properties 33-5 
Hilbert dimension 234-5 
infinite-dimensional complex 
253 
inner product 290 
isomorphism 235-6, 265 
of L² 513 
notation 356 
orthogonal decomposition 
theorem 34 
orthonormal bases 231-4 
orthonormal sets 225-8 
projections 228-30 
quadratic forms 307-11, 321-2 
reflexive 136 
special case of Banach space 32 
symmetric operators 303-7 
tensor products 240-1 
Hilbert-Schmidt operators 356, 
369-71 
Hille-Yosida space of an arbitrary 
operator 315 
Hölder’s inequality 23, 25-6, 30, 155, 
514 
holomorphic semigroups of 
operators 542-7 
homomorphisms 118, 196, 211 
*-homomorphisms 323 
algebra 254 




linear (isomorphism) 135 

of X onto itself 140 
Hörmander theorem 536 
(hyper)cubes 532 
hypergeometric random variable 417 


ideals (C*-algebras) 333-5 
identity map, from Banach space 179 
identity operator 179, 256, 270-3, 
292, 298 
improper Riemann-Stieltjes 
integral 52, 71, 426 
independence hypothesis 448 
inductive limit topology 494 
inferior limit (lim inf) 4 
infinite divisibility 481-2 
inner product 30-2, 225, 341 
Hilbert space 290 
sesqui-linearity 226 
integrable functions 16-23, 59, 71, 
286, 414 
integral representation 253-88 
integration 255-60 
non-negative measurable 
functions 10-16 
integration operators 543 
interpolation theorem 130, 152-3 
inverse operator 291 
inversion theorem 433 
invertible elements 193 
involutions 
(anti)multiplicativity 207 
and C*-algebras 193, 207-11 
canonical 207 
isometric 208 
irreducibility of representations of 
C*-algebras 346 
irreducible representations 345-6 
isometric conjugate-isomorphism 
V 136 
isometric isomorphism 137, 366, 368 
natural 107 
isomorphism 
canonical isometric 365 
Hilbert spaces 235-6 
linear homomorphism 135-7 




Jacobian 442, 445 
Jordan canonical form, complex 
matrices 260 
Jordan decomposition 102, 260, 349 
real measure p 44 


Kakutani’s fixed point theorem 130, 
159-61 

Kaplansky’s density theorem 355, 
362, 377 

Kolmogorov inequality 418 

Kolmogorov lemma 475 

Kolmogorov ‘three series 
theorem’ 479 

Krein-Milman theorem 130, 346 

extremal points 146-7 

Krein-Smulian theorem 130, 167-8, 
169, 378 

Kronecker’s delta 127, 225 


L? random variables 400-1, 497, 
504-5, 512 
L? random variables 36, 189, 249, 
251, 375, 401, 406-7, 414, 
475-81, 509 
λ, complex with modulus 197 
Laplace 
central limit theorems 440-1 
density 430 
distribution 482 
theorem 440 
transform 312 
Laurent expansion 200, 277 
Lavrentiev’s theorem 268 
Lax-Milgram theorem 188 
Lebesgue 
conjugates 110-14 
decomposition theorem 37, 39-40 
dominated convergence theorem 20, 
71, 74, 125, 262, 264, 298, 519 
dt/2π 225 
dz 441 
Haar measure 107 
integrable 1 
uu 69 


Marcinkiewicz’s interpolation 
theorem 130, 152-3 


monotone convergence theorem 13, 
27, 38, 52, 414, 465 
planar measure zero 269 
spaces, conjugate of 107 
Lebesgue measure 59, 522-3 
zero 269 
Lebesgue-Radon-Nikodym 
theorem 35-40 
averages lemma 36-7 
see also Riesz representation 
theorem 
Lebesgue-Stieltjes measure 67-70, 
426 
left translation invariance 397 
positive measure on G 107, 119, 
121, 123-5 
Leibnitz formula, general 498 
linear algebra 193, 281 
quadratic forms 307-11, 321-2 
linear maps 143 
between normed spaces 108-9 
linear operators, in a Banach 
space 290-3 
linear regression 470-1 
‘little’ Riesz representation 
theorem 35, 126, 310, 509 
local convexity 129 
locally compact Hausdorff space 
see Hausdorff 
locally convex topological vector 
spaces 129-30, 142, 143, 147, 
159, 167, 171, 183, 494, 499, 
504 
logarithms, Banach algebra 284 
lower semi-continuous 
function 310-11 
L^p-spaces 23-30, 56, 70-1, 77, 102, 
113, 516-18, 543-4 
reflexivity for 1 < p < 
infinity 134-7 
supremum and infimum 70 
see also Banach space 
Lusin’s theorem 81, 93-6 




Lyapounov central limit theorem 437, 
440 
Lyapounov condition 438 


Marcinkiewicz’s interpolation 
theorem 130, 152-3, 157 
Markov-Kakutani’s fixed point 
theorem 130, 157-9 
matrices 
complex matrices 260 
diagonal matrix 253 
‘matrix tricks’ 355 
maximum likelihood estimators 
(MLEs) 453 
measurable sets 
functions 1-8 
structure 65-7 
measure theory 1-58 
abstract measure theory 2 
complex measures 508 
construction of measures 59-64 
distribution function 51-3 
integration of non-negative 
measurable functions 10-16 
non-commutative 356 
positive measures 8-10 
truncation of function 53-4 
measures 
outer 62-4 
on Rk, differentiability 97-101 
space 81 
support of a measure 97 
and topology 81-106 
Milman’s theorem 130, 171 
Minkowski 
functional 140 
inequality 24 
modular function 125 
monotone convergence theorem 
Lebesgue 13, 27, 38, 52, 414, 
465 
see also Lebesgue 
Morera’s theorem 544 
mouse problem 422 
multiplicativity 




of analytic operational calculus 277 
of E 413 
involution 207 
of norm 194 
of r 257, 273, 275, 303 
submultiplicativity 216 

mutually orthogonal projections 256 


natural isometric isomorphism 107 
natural linear map 365 
natural/canonical embedding, of X 
onto X** 134 
Neumann expansion 197 
of resolvent 275, 531 
Neyman-Pearson lemma 457-60 
non-commutative Gelfand-Naimark 
theorem 323-4, 338, 340, 345 
non-commutative measure theory 356 
non-commutative Taylor theorem 285 
non-trivial normed vector space, with 
uniform norm 83 
norm-null sequence 167 
norm-decreasing algebra 
homomorphism 302 
norm topology 456 
normal distribution 448 
normal elements 
C*-algebras 194, 207, 211-12, 220 
continuous operational calculus 
of 194, 323, 325, 327 
normal operators 260-2, 264 
normed spaces 6, 18, 20, 28, 29, 32, 
43, 96 
arbitrary linear maps 
between 108-9 
bounded linear mappings 173-9 
Cartesian product X x Y 179 
complete see Banach space 
completeness of 180 
conjugate of C,(X) 114 
conjugate space X* 107, 109 
strongly closed unit ball S* 148, 
362 
Dieudonné’s theorem 166 
Hahn-Banach theorem 131 
operator norm 342 




quotient norm 181-3 
quotient X/M 173 
subspace of 129, 132-4, 366 
weak topology 130, 165 
weighted L^p spaces 516-17 
norm-null sequence 167 
notations 
bounded operators on X 109, 194 
differentiability 97 
examples of C*-algebras 324-5 
‘hat notation’ 135 
Hilbert spaces 356 
orthogonal projection 230 
space, complex (real) continuous 
functions with compact 
support 83 


open mapping theorem 173, 177, 
178 
operational calculus 
normal element x 327 
positive square root function 328 
unbounded selfadjoint 
operators 289, 298-9 
operator algebras see von Neumann 
algebras 
operator functions, Riemann 
integrals 286 
operator theory 129 
operators 
adjoints 165, 262, 289, 296-9, 
357 
classical Volterra 219 
closable 290, 304, 305 
closed 180, 185, 290, 291, 303-5, 
312 
compact normal operators 284 
compact operators 279-82 
conjugation operator 136 
convolution operators 531-42 
defined 173 
direct sum of the operators 237 
Hilbert-Schmidt 356 
holomorphic semigroups 542-7 
inverse 291 
linear, basics 290-3 


normal, scalar 260-2, 264 
polar decomposition 286, 355 
positive 285 
Riemann-Liouville fractional 
integration operators 543 
Riesz-Schauder theory of compact 
operators 253 
scalar 259 
selfadjoint 294, 310 
semigroups 287 
spectral 260 
strong operator topology 173, 182, 
266, 288, 355, 362, 544 
symmetric 296, 303-4 
topology 356 
trace-class 356 
unbounded 289-322 
unbounded selfadjoint, operational 
calculus 289, 298-9 
weak operator topology 182-3, 247, 
356 
ordered sampling 411, 416-17 
orthogonal decomposition 229, 256 
orthogonal decomposition 
theorem 263, 296, 306 
orthogonal projections 228, 256 
orthogonal vectors 225 
orthonormal bases/sets 225-8, 231-4 
complete/maximal 232 
Hilbert spaces 231-4 
properties 1-7 232 
outer measures 62-4 


P-null set 465, 466 
Paley-Wiener-Schwartz theorem 511, 
531 
Parseval 
formula, Fourier transform 506 
identity 539 
partition of unity 81-4 
Paul Lévy continuity theorem 437, 
483 
point mass measure 263 
point spectrum 291 




pointwise convergence in norm 1, 173 
Poisson 
distribution 428, 453, 482, 484 
integrals 243 
population 453 
random variables 428 
polar decomposition 286 
operators 355 
von Neumann algebras 363-4 
polarization identity 32 
polynomial operational calculus 199, 
267 
population, using ‘sample 
outcomes’ 448 
positive elements 327-32 
positive linear functionals 84-91, 
335-40 
C*-algebras 335-40 
and convexity 346-50 
positive measures/semi-algebras 8-10, 
59-61 
space (X, A, μ) 137 
positive operators 285 
positive square root function 
328 
predual, defined 356 
probability distributions 424-33 
probability space 410, 411-24 
probability theory 409-90 
characteristic functions 433-41 
conditional probability 462-75 
estimation and decision 450-62 
heuristics 409-11 
infinite divisibility 481-5 
probability distributions 424-33 
probability space 411-24 
sequences of random 
variables 485-9 
series of L² random 
variables 475-81 
vector-valued random 
variables 441-50 
product measures 71-6 
projections 
complementary projection 228 
Hilbert spaces 228-30 




orthogonal projection 229 

von Neumann algebras 356 
pure states 324, 346-7 
Pythagoras’s theorem 226 


quadratic forms 289-90, 307-9, 
321-2, 473 

quasi-distribution functions 426, 
434-5, 436 

quasi-nilpotent commuting 260 

element of a C*-algebra 210 

quotient map 182 

quotient norm 181 

quotient X/M, normed space 173 


Radon-Nikodym derivative 98, 99 
Radon-Nikodym theorem 37, 40, 467 
random sampling 410-11, 416-17 
random variables 
Bernoulli 416, 428, 440-1 
bounded sequence 440, 477 
conditioning 467-71 
discrete random variables 428 
distribution function 425-6 
equidistributed random 
variables 441 
Gamma-distributed 432 
hypergeometric random 
variable 417 
independent integrable 413-14 
L² random variables 414, 475-81 
Poisson-type 428 
probability theory 485-9 
real independent integrable 414 
sequences 411, 485-9 
series of L² random variables 414, 
475-9 
vector-valued 441-50 
rectangles, measurable, 
semi-algebra 71 
reflexive embedding 129 
reflexive space, necessarily 
complete 134 
reflexivity 134-7 




all L^p spaces for 1 < p < 
infinity 137 
rejection region, statistical test 457 
renorming method/theorem 265-7 
representation 
algebra homomorphism 274 
Riesz-Markov theorem 81, 91-3, 
107, 114, 116-18 
representation theorems 35, 107, 126, 
258, 271-3, 309, 310, 323-4, 
509 
see also Riesz representation 
theorem 
residual spectrum 292 
resolution of the identity 
selfadjoint 256 
for T 260, 298 
on Z 270-3 
resolvent 
identity 292 
of a semigroup generator 311-15 
set 291 
of x 197 
restriction map, Banach space 134 
Riemann integrable functions 1, 59, 
71 
operator functions 286 
Riemann sphere 300 
Riemann vs Lebesgue 70-1 
Riemann-Liouville 
fractional integration operators 543 
semigroup 546-7 
Riemann-Stieltjes integral 52, 426, 
485 
Riesz projection 277 
Riesz representation theorem 107, 
258, 271-3, 309, 323-4 
representation theorem, ‘little’ 35, 
126, 310 
Riesz-Dunford integral 274 
Riesz-Markov representation 
theorem 81, 91-3, 107, 114, 
116-18 
Riesz-Markov theorem, alternative 
construction of Lebesgue 
theorem 91-3 


Riesz-Schauder theorem 253, 280 
Runge theorem 282 


sample error 453 
sample mean 419 
sample space 411 
sampling 
ordered sampling 411, 417 
random sampling 410-11, 416-17 
scalar operator 259 
Schauder theorem 253, 280 
Schwartz space 503-4 
Schwarz inequality 136, 257, 414, 427, 
507, 512, 516 
selfadjoint 
element, C*-algebra 210 
idempotent 352 
matrices 253 
of operators 294, 310 
resolution of identity 256 
spectral measures 256, 260 
semi-inner product 225, 340 
semi-measures on 
semi-algebras 59-61, 67 
semi-normed space 1 
semi-simplicity space Z 267-70 
unbounded operators in Banach 
space 289, 300-3 
semigroups 
continuous in the u.o.t. 312 
convergence 316 
generator 311 
resolvent 312 
of operators 287 
separating vector space, linear 
functionals 143 
separation 137-40 
convex sets in vector spaces 137 
separation theorem 142 
strict 142 
sesquilinearity 
inner product 226 
map 240 
σ-additivity 70, 263 
σ-algebras 2, 424-5 
Snedecor density 449, 450 
Snedecor distribution 462 
Sobolev’s space 518 
space ℰ(Ω) 499 
spatial isomorphism 376 
spectral measure 254 
spectral operators 260 
spectral radius 197 
spectral set 278 
spectral theorem 287 
for bounded normal 
operators 260-2 
normal operators 260-2, 264 
resolution of the identity 264, 287 
for unbounded selfadjoint 
operators 289, 296-8 
spectrum 
integral 256 
isolated points 277-8 
mapping theorem 212, 276 
for polynomials 199 
parts of 260-3 
representation 264-5 
of X (relative to T) 265 
residual 262 
selfadjoint element of 
C*-algebra 210 
standard deviation 414 
*-homomorphisms see isomorphism 
star-algebra (*-algebra) 
see B*-algebras; C*-algebras; 
W*-algebras 
states, positive linear functional of 
norm one 324 
statistic, defined 444 
statistical test 
power of 457 
rejection region 457 
Stieltjes integral 485 
Stirling’s formula 449 
Stone’s theorem 407 
Stone-Weierstrass theorem 130, 
150-2, 171, 210, 211, 264 
strong operator topology 173, 182, 
266, 288, 355, 362, 544 
strong pointwise convergence 182 




Student density, with v degrees of 
freedom 447-8, 456, 461 
Student distribution 447-8, 456, 
461-2 
sublinear operators 152 
submartingale convergence 
theorem 489 
submartingale sequence 488 
‘success frequencies’ 418, 452 
sum, distribution of 442 
summation formula 197 
superior limit (lim sup) 4 
support of a measure 97 
supremum norm 254 
uniform norm 6 
symmetric extension 304, 306, 307 
symmetric operators 295, 296, 318 
in Hilbert space 303-7 
symmetric extension of T 304, 306, 
307 
unbounded 304, 306, 307 
symmetric subspace 304 


T 
positive and negative deficiency 
spaces 305 
resolution of the identity for 260, 
263, 265, 287, 319 
t-density with v degrees of 
freedom 447 
t-test see Student distribution 
Taylor formula 438-9, 516 
Taylor theorem, 
non-commutative 285 
Tchebichev’s inequality 418, 421, 423 
temperate distributions 503-19 
Fourier transform 508 
temperate weights 514 
tensor products 
of C*-algebras 383-97, 403-5 
of Hilbert spaces 239 
of vector spaces 240-1 
of von Neumann algebras 383-97, 
403-5 




θ, estimator of θ 451 
Tonelli’s theorem 76, 154, 443, 540 
topological 
group 118 
space 81 
vector spaces 129, 140-2 
continuous linear functional 
f 142 
see also strong operator topology; 
weak topologies 
trace-class operators 356, 368, 
371-4 
translation operators 501 
translation-invariant positive 
measure 118 
linear functional 125 
triangular inequality 6 
truncation of function, measure 
theory 53-4 
Tychonoff’s theorem 121, 144 


unbounded operators 289-322 
basics 290-3 
on Hilbert spaces 319 
operational calculus 289, 298-9 
self-adjoint 289, 298-9 
unbounded symmetric operators 
see Symmetric extension 
uniform boundedness theorem 173, 
175-6 
uniform convexity 170 
uniform norm 6 
uniform operator topology 182 
uniqueness theorem 434 
unit circle 269 
universality property of universal 
representation π 345 
Urysohn’s lemma, for locally compact 
Hausdorff space 82 


V, conjugate-linear and 
norm-preserving 136 

variable coefficients 528-31 

variance 430, 455, 461 




vector measures, E(-)x 255 
vector space X 
complex 307 
with Hausdorff topology 140 
vector spaces 129 
tensor products 240-1 
separating family of linear 
functionals 129 
see also locally convex topological 
vector spaces 
Volterra operator 219 
von Neumann algebras 355-82 
commutants 359-62 
commutative 375-6 
cyclic vector 360 
defined 355-6 
density 362-3 
double commutant theorem 307, 
355, 377 
enveloping, of a C*-algebra 356, 
376-9 
Hilbert-Schmidt and trace-class 
operators 368-75 
polar decomposition 363-4 
preliminaries 356-9 
separating vector 360 




tensor product 383-97, 403-5 
W*-algebras 356, 365-8 


W*-algebras 356, 365-8 
unique predual 365-8 
weak law of large numbers 419 
weak topologies 143-6 
weak*-topology 130, 144, 165-9 
weak operator topology 173, 182-3, 
247, 356 
Weierstrass approximation 
theorem 150 
see also Stone-Weierstrass theorem 
weighted spaces 
L^p spaces 516 
W_{p,k} spaces 514-19 


Z, resolution of the identity on 255, 
257, 270-3 

Z = X 257 

Z-valued additive set function 255 

zero hypothesis 450, 452 

zero-one law 424 

Zorn’s lemma 131, 147, 265 


